Composite Part Assembly with no defined assembly method #378

Closed
IsabelMarleen opened this issue Aug 7, 2020 · 19 comments

Comments

@IsabelMarleen

This line of code works as in the examples provided:
composition_component.assemblePrimaryStructure(list(all_parts), IGEM_STANDARD_ASSEMBLY)
But if you are using iGEM parts and remove IGEM_STANDARD_ASSEMBLY to avoid the iGEM scars, the following error message appears when you try to write out the document:
'Invalid. sbol-12403:\x00 Strong Validation Error:\x00 The endedAtTime property of an Activity object is OPTIONAL and MAY contain a DateTime. \x00Reference: SBOL Version 2.3.0 Section 12.8.1 on page 73 :\x00 https://synbiohub.org/public/igem/igem2sbol/1\x00 Validation failed.'

It seems that an igem2sbol activity is being added to the composite part and that its end time is invalid (possibly only the type of the value is wrong).

@tcmitchell
Collaborator

Hi @IsabelMarleen, I am unable to reproduce your issue from the information you have provided.

You said that the assemblePrimaryStructure works "as in the examples provided". The only example that includes "IGEM_STANDARD_ASSEMBLY" is tutorial.py. I took that example and modified it to remove the use of sbol2.IGEM_STANDARD_ASSEMBLY and to write out the file. Here is my diff:

$ diff examples/tutorial.py issue378.py 
22,23c22,24
< my_device.assemblePrimaryStructure([promoter, rbs, cds, term],
<                                    sbol2.IGEM_STANDARD_ASSEMBLY)
---
> my_device.assemblePrimaryStructure([promoter, rbs, cds, term])
> 
> doc.write('issue378.xml')

This program runs successfully without any validation error.

One option for you is to turn off validation so that your file will be written and you can inspect the results. To do that, add this to your program:

sbol2.Config.setOption(sbol2.ConfigOptions.VALIDATE, False)

If you can provide additional information, like a short and repeatable test case, we'd be happy to look at this further. Please also be sure to tell us what version of pySBOL2 you are using. We have just released version 1.1 this morning.

@cjmyers

cjmyers commented Aug 7, 2020

The activity igem2sbol is downloaded with the parts, since this is the script used to create the iGEM records. I'm not sure why the end date is invalid. I think you might want to do as Tom suggests and turn off validation and look at the end date. It is valid in SBH, so it must become invalid either upon download (unlikely) or when it is brought into pySBOL (more likely, I think).

@tcmitchell
Collaborator

@IsabelMarleen I was hoping the information above from @cjmyers would help me reproduce the issue if it was indeed a problem parsing dates from SynBioHub. It doesn't look like it's that simple. Here's a program to download igem2sbol (and a good example of what I mean by a short, reproducible test case). The document that gets generated appears to pass validation and appears to have the correct (and valid) date.

import sbol2

ps = sbol2.PartShop('https://synbiohub.org')
doc = sbol2.Document()
ps.pull('https://synbiohub.org/public/igem/igem2sbol/1', doc)
doc.write('issue378.xml')

Resulting SBOL file:

<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#" xmlns:xsd="http://www.w3.org/2001/XMLSchema#" xmlns:dcterms="http://purl.org/dc/terms/" xmlns:prov="http://www.w3.org/ns/prov#" xmlns:sbol="http://sbols.org/v2#" xmlns:xsd1="http://www.w3.org/2001/XMLSchema#dateTime/" xmlns:om="http://www.ontology-of-units-of-measure.org/resource/om-2/" xmlns:synbiohub="http://synbiohub.org#" xmlns:sbh="http://wiki.synbiohub.org/wiki/Terms/synbiohub#" xmlns:sybio="http://www.sybio.ncl.ac.uk#" xmlns:ncbi="http://www.ncbi.nlm.nih.gov#" xmlns:igem="http://wiki.synbiohub.org/wiki/Terms/igem#" xmlns:genbank="http://www.ncbi.nlm.nih.gov/genbank#" xmlns:gbconv="http://sbols.org/genBankConversion#" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:obo="http://purl.obolibrary.org/obo/">
  <prov:Activity rdf:about="https://synbiohub.org/public/igem/igem2sbol/1">
    <dc:creator>James Alastair McLaughlin</dc:creator>
    <prov:endedAtTime>2017-03-06T15:00:00+00:00</prov:endedAtTime>
    <dcterms:title>iGEM to SBOL conversion</dcterms:title>
    <sbol:displayId>igem2sbol</sbol:displayId>
    <dc:creator>Chris J. Myers</dc:creator>
    <sbol:persistentIdentity rdf:resource="https://synbiohub.org/public/igem/igem2sbol"/>
    <sbh:ownedBy rdf:resource="https://synbiohub.org/user/myers"/>
    <sbh:topLevel rdf:resource="https://synbiohub.org/public/igem/igem2sbol/1"/>
    <sbh:ownedBy rdf:resource="https://synbiohub.org/user/james"/>
    <dcterms:description>Conversion of the iGEM parts registry to SBOL2.1</dcterms:description>
    <sbol:version>1</sbol:version>
  </prov:Activity>
</rdf:RDF>

@IsabelMarleen
Author

@tcmitchell Thank you for your comments. I am using the current 1.1 version of sbol2. I put together the following minimal example. It writes an XML file, and the validation error only occurs when that file is uploaded to SynBioHub and IGEM_STANDARD_ASSEMBLY is not passed to composition.assemblePrimaryStructure():

import sbol2

doc = sbol2.Document()
igem = sbol2.PartShop('https://synbiohub.org/public/igem')
igem.pull('BBa_I719005', doc)

composition = doc.componentDefinitions.create('composition')
composition.assemblePrimaryStructure(['BBa_I719005', 'BBa_I719005'])
doc.write("minimal_example")

The resulting xml file seems normal and there is no discernible problem with the date format which throws the error. The section below is the relevant excerpt from the full xml file:

  <prov:Activity rdf:about="https://synbiohub.org/public/igem/igem2sbol/1">
    <dcterms:description>Conversion of the iGEM parts registry to SBOL2.1</dcterms:description>
    <sbh:ownedBy rdf:resource="https://synbiohub.org/user/myers"/>
    <dc:creator>Chris J. Myers</dc:creator>
    <sbh:topLevel rdf:resource="https://synbiohub.org/public/igem/igem2sbol/1"/>
    <sbol:displayId>igem2sbol</sbol:displayId>
    <prov:endedAtTime>2017-03-06T15:00:00+00:00</prov:endedAtTime>
    <sbol:version>1</sbol:version>
    <sbol:persistentIdentity rdf:resource="https://synbiohub.org/public/igem/igem2sbol"/>
    <dcterms:title>iGEM to SBOL conversion</dcterms:title>
    <dc:creator>James Alastair McLaughlin</dc:creator>
    <sbh:ownedBy rdf:resource="https://synbiohub.org/user/james"/>

@cjmyers

cjmyers commented Aug 10, 2020

Isabel, can you provide the full SBOL file? Have you tried to validate it using the online validator directly? If so, the next step would be to try to validate with libSBOLj in debug mode and see what its complaint is. From what I can see, the date is the same in both the case you say is failing and the one that Tom says succeeds. This is very strange.

@IsabelMarleen
Author

I just tried validating it using the online validator and got the same error about the time format. The full SBOL file is attached below:

<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#" xmlns:xsd="http://www.w3.org/2001/XMLSchema#" xmlns:dcterms="http://purl.org/dc/terms/" xmlns:prov="http://www.w3.org/ns/prov#" xmlns:sbol="http://sbols.org/v2#" xmlns:xsd1="http://www.w3.org/2001/XMLSchema#dateTime/" xmlns:om="http://www.ontology-of-units-of-measure.org/resource/om-2/" xmlns:synbiohub="http://synbiohub.org#" xmlns:sbh="http://wiki.synbiohub.org/wiki/Terms/synbiohub#" xmlns:sybio="http://www.sybio.ncl.ac.uk#" xmlns:ncbi="http://www.ncbi.nlm.nih.gov#" xmlns:igem="http://wiki.synbiohub.org/wiki/Terms/igem#" xmlns:genbank="http://www.ncbi.nlm.nih.gov/genbank#" xmlns:gbconv="http://sbols.org/genBankConversion#" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:obo="http://purl.obolibrary.org/obo/">
  <sbol:ComponentDefinition rdf:about="http://sys-bio.org/ComponentDefinition/composition/1">
    <sbol:component>
      <sbol:Component rdf:about="http://sys-bio.org/ComponentDefinition/composition/BBa_I719005_1/1">
        <sbol:displayId>BBa_I719005_1</sbol:displayId>
        <sbol:version>1</sbol:version>
        <sbol:definition rdf:resource="https://synbiohub.org/public/igem/BBa_I719005/1"/>
        <sbol:persistentIdentity rdf:resource="http://sys-bio.org/ComponentDefinition/composition/BBa_I719005_1"/>
        <sbol:access rdf:resource="http://sbols.org/v2#public"/>
      </sbol:Component>
    </sbol:component>
    <sbol:type rdf:resource="http://www.biopax.org/release/biopax-level3.owl#DnaRegion"/>
    <sbol:sequenceConstraint>
      <sbol:SequenceConstraint rdf:about="http://sys-bio.org/ComponentDefinition/composition/constraint_0/1">
        <sbol:object rdf:resource="http://sys-bio.org/ComponentDefinition/composition/BBa_I719005_0/1"/>
        <sbol:restriction rdf:resource="http://sbols.org/v2#precedes"/>
        <sbol:persistentIdentity rdf:resource="http://sys-bio.org/ComponentDefinition/composition/constraint_0"/>
        <sbol:subject rdf:resource="http://sys-bio.org/ComponentDefinition/composition/BBa_I719005_1/1"/>
        <sbol:displayId>constraint_0</sbol:displayId>
        <sbol:version>1</sbol:version>
      </sbol:SequenceConstraint>
    </sbol:sequenceConstraint>
    <sbol:version>1</sbol:version>
    <sbol:component>
      <sbol:Component rdf:about="http://sys-bio.org/ComponentDefinition/composition/BBa_I719005_0/1">
        <sbol:persistentIdentity rdf:resource="http://sys-bio.org/ComponentDefinition/composition/BBa_I719005_0"/>
        <sbol:displayId>BBa_I719005_0</sbol:displayId>
        <sbol:version>1</sbol:version>
        <sbol:access rdf:resource="http://sbols.org/v2#public"/>
        <sbol:definition rdf:resource="https://synbiohub.org/public/igem/BBa_I719005/1"/>
      </sbol:Component>
    </sbol:component>
    <sbol:displayId>composition</sbol:displayId>
    <sbol:persistentIdentity rdf:resource="http://sys-bio.org/ComponentDefinition/composition"/>
    <sbol:type rdf:resource="http://identifiers.org/so/SO:0000987"/>
  </sbol:ComponentDefinition>
  <sbol:ComponentDefinition rdf:about="https://synbiohub.org/public/igem/BBa_I719005/1">
    <sbol:role rdf:resource="http://wiki.synbiohub.org/wiki/Terms/igem#partType/Regulatory"/>
    <sbh:bookmark>true</sbh:bookmark>
    <sbol:displayId>BBa_I719005</sbol:displayId>
    <sbol:sequence rdf:resource="https://synbiohub.org/public/igem/BBa_I719005_sequence/1"/>
    <sbh:topLevel rdf:resource="https://synbiohub.org/public/igem/BBa_I719005/1"/>
    <igem:discontinued>false</igem:discontinued>
    <igem:status rdf:resource="http://wiki.synbiohub.org/wiki/Terms/igem#status/Available"/>
    <sbol:role rdf:resource="http://identifiers.org/so/SO:0000167"/>
    <igem:experience rdf:resource="http://wiki.synbiohub.org/wiki/Terms/igem#experience/Works"/>
    <sbol:persistentIdentity rdf:resource="https://synbiohub.org/public/igem/BBa_I719005"/>
    <igem:m_user_id>0</igem:m_user_id>
    <igem:group_u_list>_128_</igem:group_u_list>
    <sbol:type rdf:resource="http://www.biopax.org/release/biopax-level3.owl#DnaRegion"/>
    <sbol:version>1</sbol:version>
    <sbh:ownedBy rdf:resource="https://synbiohub.org/user/james"/>
    <igem:owner_id>2097</igem:owner_id>
    <dcterms:title>pT7</dcterms:title>
    <sbh:star>true</sbh:star>
    <sbh:mutableProvenance>---</sbh:mutableProvenance>
    <igem:owning_group_id>9</igem:owning_group_id>
    <igem:partStatus>Released HQ 2013</igem:partStatus>
    <dcterms:created>2007-10-23T11:00:00Z</dcterms:created>
    <sbh:mutableDescription>Just a T7 Promoter</sbh:mutableDescription>
    <sbh:ownedBy rdf:resource="https://synbiohub.org/user/myers"/>
    <dcterms:description>T7 Promoter</dcterms:description>
    <prov:wasDerivedFrom rdf:resource="http://parts.igem.org/Part:BBa_I719005"/>
    <prov:wasGeneratedBy rdf:resource="https://synbiohub.org/public/igem/igem2sbol/1"/>
    <igem:sampleStatus>In stock</igem:sampleStatus>
    <dcterms:modified>2015-08-31T04:07:53Z</dcterms:modified>
    <dc:creator>Imperial 2007</dc:creator>
    <igem:dominant>false</igem:dominant>
    <sbh:mutableNotes>None</sbh:mutableNotes>
  </sbol:ComponentDefinition>
  <sbol:Sequence rdf:about="https://synbiohub.org/public/igem/BBa_I719005_sequence/1">
    <sbh:ownedBy rdf:resource="https://synbiohub.org/user/james"/>
    <prov:wasGeneratedBy rdf:resource="https://synbiohub.org/public/igem/igem2sbol/1"/>
    <sbol:elements>taatacgactcactatagggaga</sbol:elements>
    <sbol:encoding rdf:resource="http://www.chem.qmul.ac.uk/iubmb/misc/naseq.html"/>
    <sbh:topLevel rdf:resource="https://synbiohub.org/public/igem/BBa_I719005_sequence/1"/>
    <sbol:persistentIdentity rdf:resource="https://synbiohub.org/public/igem/BBa_I719005_sequence"/>
    <sbol:displayId>BBa_I719005_sequence</sbol:displayId>
    <prov:wasDerivedFrom rdf:resource="http://parts.igem.org/Part:BBa_I719005"/>
    <sbol:version>1</sbol:version>
    <sbh:ownedBy rdf:resource="https://synbiohub.org/user/myers"/>
  </sbol:Sequence>
  <prov:Activity rdf:about="https://synbiohub.org/public/igem/igem2sbol/1">
    <sbh:ownedBy rdf:resource="https://synbiohub.org/user/myers"/>
    <dcterms:title>iGEM to SBOL conversion</dcterms:title>
    <sbh:ownedBy rdf:resource="https://synbiohub.org/user/james"/>
    <prov:endedAtTime>2017-03-06T15:00:00+00:00</prov:endedAtTime>
    <dcterms:description>Conversion of the iGEM parts registry to SBOL2.1</dcterms:description>
    <sbol:persistentIdentity rdf:resource="https://synbiohub.org/public/igem/igem2sbol"/>
    <dc:creator>James Alastair McLaughlin</dc:creator>
    <sbol:displayId>igem2sbol</sbol:displayId>
    <dc:creator>Chris J. Myers</dc:creator>
    <sbh:topLevel rdf:resource="https://synbiohub.org/public/igem/igem2sbol/1"/>
    <sbol:version>1</sbol:version>
  </prov:Activity>
</rdf:RDF>

@tcmitchell
Collaborator

tcmitchell commented Aug 10, 2020

There are a number of components to this issue which I think we need to break down. I'm going to start by working towards a minimal example. @IsabelMarleen gave us a program, but it turns out we can make it smaller and still see the problem. Here is the original program:

import sbol2

doc = sbol2.Document()
igem = sbol2.PartShop('https://synbiohub.org/public/igem')
igem.pull('BBa_I719005', doc)

composition = doc.componentDefinitions.create('composition')
composition.assemblePrimaryStructure(['BBa_I719005', 'BBa_I719005'])
doc.write("minimal_example")

When this program is run and the resulting file is handed to the SBOL validator it generates the following report:

sbol-10222: Weak Validation Error:
Each URI contained by the wasGeneratedBys property of an Identified object MUST refer to an Activity object.
Reference: SBOL Version 2.3.0 Section 12.8 on page 70
: https://synbiohub.org/public/igem/BBa_I719005_sequence/1

sbol-10222: Weak Validation Error:
Each URI contained by the wasGeneratedBys property of an Identified object MUST refer to an Activity object.
Reference: SBOL Version 2.3.0 Section 12.8 on page 70
: https://synbiohub.org/public/igem/BBa_I719005/1

sbol-12403: Strong Validation Error:
The endedAtTime property of an Activity object is OPTIONAL and MAY contain a DateTime.
Reference: SBOL Version 2.3.0 Section 12.8.1 on page 73
: https://synbiohub.org/public/igem/igem2sbol/1

Validation failed.

Let's first modify the original program to print the result from Document.write(), which will contain the validation report. Here's the modified program:

import sbol2


doc = sbol2.Document()
igem = sbol2.PartShop('https://synbiohub.org/public/igem')
igem.pull('BBa_I719005', doc)

composition = doc.componentDefinitions.create('composition')
composition.assemblePrimaryStructure(['BBa_I719005', 'BBa_I719005'])
result = doc.write("minimal_example.xml")
print(result)

And here's the result of running the program:

$ python examples/issue378.py       
Invalid. sbol-12403: Strong Validation Error: The endedAtTime property of an Activity object is OPTIONAL and MAY contain a DateTime. Reference: SBOL Version 2.3.0 Section 12.8.1 on page 73 : https://synbiohub.org/public/igem/igem2sbol/1  Validation failed.

Ok, good. So the resulting file isn't really passing validation. It's that the program wasn't telling us about the validation result.
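
As a small aid for reading reports like the one above, here is a minimal sketch in plain Python (no pySBOL2 needed). The helper name summarize_report is hypothetical. It assumes only what this thread shows: a failing report starts with "Invalid", and the raw string may contain \x00 separators, as in the error message quoted at the top of the issue.

```python
def summarize_report(report):
    """Return (failed, cleaned_text) for a validation report string.

    Assumption: a failing report begins with "Invalid", as in the
    reports quoted in this thread.
    """
    # Replace the NUL separators seen in the raw string, then collapse
    # runs of whitespace so the report reads as one clean line.
    cleaned = " ".join(report.replace("\x00", " ").split())
    failed = cleaned.startswith("Invalid")
    return failed, cleaned


raw = (
    "Invalid. sbol-12403:\x00 Strong Validation Error:\x00 The endedAtTime "
    "property of an Activity object is OPTIONAL and MAY contain a DateTime. "
    "\x00Validation failed."
)
failed, text = summarize_report(raw)
print(failed)
print(text)
```

In a script, the string returned by doc.write() could be passed to such a helper so that a failed validation does not go unnoticed.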

Now let's look at this strong validation error about the endedAtTime property. The first thing to notice is that it doesn't seem to be related to the ComponentDefinition.assemblePrimaryStructure() call so let's see what happens if we make the program shorter and eliminate the call to assemblePrimaryStructure(). Here's a shorter program:

import sbol2


doc = sbol2.Document()
igem = sbol2.PartShop('https://synbiohub.org/public/igem')
igem.pull('BBa_I719005', doc)
result = doc.write("minimal_example.xml")
print(result)

This results in the same validation error. This leads me to believe that assemblePrimaryStructure() as stated in this issue title is a red herring. There's a problem, but it doesn't appear to be related to the assemblePrimaryStructure() method.

@tcmitchell
Collaborator

tcmitchell commented Aug 10, 2020

This is a long train of thought so I'm breaking it down into multiple comments. We left off with a smaller working example above. Let's see if we can make it even smaller by only looking at the Activity on which the validation error is reported. Here is a new program which is the same length as before, only pulls the single item in question, and still gives us the error:

import sbol2


doc = sbol2.Document()
igem = sbol2.PartShop('https://synbiohub.org/public/igem')
igem.pull('igem2sbol', doc)
result = doc.write("minimal_example.xml")
print(result)

Again, same error. Here is the SBOL file generated by the above program:

<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#" xmlns:xsd="http://www.w3.org/2001/XMLSchema#" xmlns:dcterms="http://purl.org/dc/terms/" xmlns:prov="http://www.w3.org/ns/prov#" xmlns:sbol="http://sbols.org/v2#" xmlns:xsd1="http://www.w3.org/2001/XMLSchema#dateTime/" xmlns:om="http://www.ontology-of-units-of-measure.org/resource/om-2/" xmlns:synbiohub="http://synbiohub.org#" xmlns:sbh="http://wiki.synbiohub.org/wiki/Terms/synbiohub#" xmlns:sybio="http://www.sybio.ncl.ac.uk#" xmlns:ncbi="http://www.ncbi.nlm.nih.gov#" xmlns:igem="http://wiki.synbiohub.org/wiki/Terms/igem#" xmlns:genbank="http://www.ncbi.nlm.nih.gov/genbank#" xmlns:gbconv="http://sbols.org/genBankConversion#" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:obo="http://purl.obolibrary.org/obo/">
  <prov:Activity rdf:about="https://synbiohub.org/public/igem/igem2sbol/1">
    <sbh:ownedBy rdf:resource="https://synbiohub.org/user/myers"/>
    <dcterms:description>Conversion of the iGEM parts registry to SBOL2.1</dcterms:description>
    <sbol:persistentIdentity rdf:resource="https://synbiohub.org/public/igem/igem2sbol"/>
    <dcterms:title>iGEM to SBOL conversion</dcterms:title>
    <sbh:topLevel rdf:resource="https://synbiohub.org/public/igem/igem2sbol/1"/>
    <sbol:displayId>igem2sbol</sbol:displayId>
    <sbh:ownedBy rdf:resource="https://synbiohub.org/user/james"/>
    <dc:creator>James Alastair McLaughlin</dc:creator>
    <prov:endedAtTime>2017-03-06T15:00:00+00:00</prov:endedAtTime>
    <dc:creator>Chris J. Myers</dc:creator>
    <sbol:version>1</sbol:version>
  </prov:Activity>
</rdf:RDF>

The validator says something is wrong with this line: <prov:endedAtTime>2017-03-06T15:00:00+00:00</prov:endedAtTime>. What should that look like? Let's download the same component from SynBioHub to see that. We'll use curl:

curl -o igem2sbol.xml https://synbiohub.org/public/igem/igem2sbol/1/sbol

That gives us the following file:

<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:dcterms="http://purl.org/dc/terms/" xmlns:prov="http://www.w3.org/ns/prov#" xmlns:sbol="http://sbols.org/v2#" xmlns:xsd="http://www.w3.org/2001/XMLSchema#dateTime/" xmlns:om="http://www.ontology-of-units-of-measure.org/resource/om-2/" xmlns:synbiohub="http://synbiohub.org#" xmlns:sbh="http://wiki.synbiohub.org/wiki/Terms/synbiohub#" xmlns:sybio="http://www.sybio.ncl.ac.uk#" xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#" xmlns:ncbi="http://www.ncbi.nlm.nih.gov#" xmlns:igem="http://wiki.synbiohub.org/wiki/Terms/igem#" xmlns:genbank="http://www.ncbi.nlm.nih.gov/genbank#" xmlns:gbconv="http://sbols.org/genBankConversion#" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:obo="http://purl.obolibrary.org/obo/">
  <prov:Activity rdf:about="https://synbiohub.org/public/igem/igem2sbol/1">
    <sbol:persistentIdentity rdf:resource="https://synbiohub.org/public/igem/igem2sbol"/>
    <sbol:displayId>igem2sbol</sbol:displayId>
    <sbol:version>1</sbol:version>
    <dcterms:title>iGEM to SBOL conversion</dcterms:title>
    <dcterms:description>Conversion of the iGEM parts registry to SBOL2.1</dcterms:description>
    <sbh:topLevel rdf:resource="https://synbiohub.org/public/igem/igem2sbol/1"/>
    <sbh:ownedBy rdf:resource="https://synbiohub.org/user/james"/>
    <sbh:ownedBy rdf:resource="https://synbiohub.org/user/myers"/>
    <dc:creator>Chris J. Myers</dc:creator>
    <dc:creator>James Alastair McLaughlin</dc:creator>
    <prov:endedAtTime rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2017-03-06T15:00:00.000Z</prov:endedAtTime>
  </prov:Activity>
</rdf:RDF>

Note the difference in the endedAtTime, which now includes rdf:datatype. Aha! Is that what's been missing? Let's upload this latest SBOL file, generated by SynBioHub, to the validator to see if it is correct. And yes, it is. So now can we do the same in pySBOL2?
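
The difference between the two files can also be checked mechanically. The following is a rough sketch using only the Python standard library. The helper names ended_at_time and activity are made up for illustration, and the two documents are stripped-down stand-ins for the files shown above, keeping only the endedAtTime property.

```python
import xml.etree.ElementTree as ET

PROV = "http://www.w3.org/ns/prov#"
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
XSD_DATETIME = "http://www.w3.org/2001/XMLSchema#dateTime"


def ended_at_time(sbol_xml):
    """Return (text, rdf:datatype) of the first prov:endedAtTime, or None."""
    root = ET.fromstring(sbol_xml)
    elem = root.find(f".//{{{PROV}}}endedAtTime")
    if elem is None:
        return None
    return elem.text, elem.get(f"{{{RDF}}}datatype")


def activity(prop):
    """Wrap one property element in a stripped-down Activity document."""
    return (
        f'<rdf:RDF xmlns:rdf="{RDF}" xmlns:prov="{PROV}">'
        '<prov:Activity rdf:about="https://synbiohub.org/public/igem/igem2sbol/1">'
        f"{prop}</prov:Activity></rdf:RDF>"
    )


# endedAtTime as written by pySBOL2 and as served by SynBioHub.
from_pysbol2 = activity("<prov:endedAtTime>2017-03-06T15:00:00+00:00</prov:endedAtTime>")
from_synbiohub = activity(
    f'<prov:endedAtTime rdf:datatype="{XSD_DATETIME}">'
    "2017-03-06T15:00:00.000Z</prov:endedAtTime>"
)

print(ended_at_time(from_pysbol2))
print(ended_at_time(from_synbiohub))
```

The first result has no datatype and no milliseconds. The second has both.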

Here's a new file generated by pySBOL2 with a fix to include an rdf:datatype:

<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#" xmlns:xsd="http://www.w3.org/2001/XMLSchema#" xmlns:dcterms="http://purl.org/dc/terms/" xmlns:prov="http://www.w3.org/ns/prov#" xmlns:sbol="http://sbols.org/v2#" xmlns:xsd1="http://www.w3.org/2001/XMLSchema#dateTime/" xmlns:om="http://www.ontology-of-units-of-measure.org/resource/om-2/" xmlns:synbiohub="http://synbiohub.org#" xmlns:sbh="http://wiki.synbiohub.org/wiki/Terms/synbiohub#" xmlns:sybio="http://www.sybio.ncl.ac.uk#" xmlns:ncbi="http://www.ncbi.nlm.nih.gov#" xmlns:igem="http://wiki.synbiohub.org/wiki/Terms/igem#" xmlns:genbank="http://www.ncbi.nlm.nih.gov/genbank#" xmlns:gbconv="http://sbols.org/genBankConversion#" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:obo="http://purl.obolibrary.org/obo/">
  <prov:Activity rdf:about="https://synbiohub.org/public/igem/igem2sbol/1">
    <dc:creator>James Alastair McLaughlin</dc:creator>
    <sbol:persistentIdentity rdf:resource="https://synbiohub.org/public/igem/igem2sbol"/>
    <dc:creator>Chris J. Myers</dc:creator>
    <sbh:topLevel rdf:resource="https://synbiohub.org/public/igem/igem2sbol/1"/>
    <dcterms:title>iGEM to SBOL conversion</dcterms:title>
    <sbh:ownedBy rdf:resource="https://synbiohub.org/user/myers"/>
    <sbol:displayId>igem2sbol</sbol:displayId>
    <dcterms:description>Conversion of the iGEM parts registry to SBOL2.1</dcterms:description>
    <prov:endedAtTime rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2017-03-06T15:00:00+00:00</prov:endedAtTime>
    <sbh:ownedBy rdf:resource="https://synbiohub.org/user/james"/>
    <sbol:version>1</sbol:version>
  </prov:Activity>
</rdf:RDF>

Note that the time is slightly different. The time is HH:MM:SS+OFFSET. It lacks the milliseconds and uses +00:00 instead of Z for the timezone. And this version DOES NOT validate.

I'll spare you the code dive into the validator, which in turn ends up in libSBOLj, and jump to the conclusion that libSBOLj uses a restrictive date/time parser that requires the milliseconds field.

Is that really a requirement? Following along from http://www.w3.org/2001/XMLSchema#dateTime, with a few stops and starts, I ended up at https://www.w3.org/TR/xmlschema11-2/#dateTime which says "dateTime uses the date/timeSevenPropertyModel, with no properties except timezoneOffset permitted to be absent." The seven properties are year, month, day, hour, minute, second, and timezone offset.

This is a question for @cjmyers - what are the acceptable formats for a dateTime in SBOL? The SBOL 2.3 specification, Section 12.7 "Annotating Entities with Date & Time" gives as an example 2016-03-16T20:12:00Z without milliseconds. If I plug that string into the Activity XML above I get a validation failure. If I modify to include milliseconds (2016-03-16T20:12:00.000Z) it validates successfully. In my opinion the dateTime parsing in libSBOLj is too restrictive by requiring milliseconds as part of the date time.
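
To make the two behaviours concrete, here is a sketch with regular expressions as stand-ins. Both patterns are approximations I wrote for illustration: the first loosely follows the XML Schema lexical form (optional fractional seconds, optional timezone, no range checks), and the second only mimics the strictness observed with the validator. Neither is the actual libSBOLj code.

```python
import re

# Simplified XML Schema dateTime: fractional seconds and timezone optional.
XSD_DATETIME = re.compile(
    r"-?\d{4,}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(\.\d+)?(Z|[+-]\d{2}:\d{2})?"
)

# Stand-in for the strict behaviour: milliseconds and timezone mandatory.
STRICT_DATETIME = re.compile(
    r"\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d{3}(Z|[+-]\d{2}:\d{2})"
)


def is_xsd_datetime(value):
    """True if value matches the simplified XML Schema pattern."""
    return XSD_DATETIME.fullmatch(value) is not None


def is_strict_datetime(value):
    """True if value also carries milliseconds and a timezone."""
    return STRICT_DATETIME.fullmatch(value) is not None


for value in (
    "2016-03-16T20:12:00Z",       # example from the SBOL 2.3 specification
    "2016-03-16T20:12:00.000Z",   # same instant with milliseconds
    "2017-03-06T15:00:00+00:00",  # what pySBOL2 wrote for igem2sbol
):
    print(value, is_xsd_datetime(value), is_strict_datetime(value))
```

All three strings match the lenient pattern, but only the one with milliseconds matches the strict one, which agrees with the validation results described above.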

The use of rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime" appears to be optional with the validator. Using the following endedAtTime validates:

<prov:endedAtTime>2016-03-16T20:12:00.000Z</prov:endedAtTime>

@tcmitchell
Collaborator

@IsabelMarleen if you add milliseconds to the endedAtTime in your generated file, I think you'll find that it validates and that you can upload it to SynBioHub. If you need to be able to do the upload programmatically, I can probably offer you a patch until the validation gets sorted out, either by a change to libSBOLj (to be more flexible) or a change to pySBOL2 (to include milliseconds).

To be specific, change this entry:

    <prov:endedAtTime>2017-03-06T15:00:00+00:00</prov:endedAtTime>

to this:

    <prov:endedAtTime>2017-03-06T15:00:00.000+00:00</prov:endedAtTime>
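
If the edit needs to be scripted, something along these lines might do as a stopgap. This is only a sketch, not the patch mentioned above: the function name add_millis is hypothetical, it works on the file's text rather than through pySBOL2, and it assumes the element is written with the prov: prefix as in the files in this thread.

```python
import re

# An endedAtTime element whose seconds field has no fractional part.
# The lookahead requires the value to end right after the seconds,
# optionally followed by a timezone, so existing milliseconds are untouched.
_NO_MILLIS = re.compile(
    r"(<prov:endedAtTime[^>]*>\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})"
    r"(?=(?:Z|[+-]\d{2}:\d{2})?</prov:endedAtTime>)"
)


def add_millis(sbol_text):
    """Insert '.000' after the seconds of every endedAtTime lacking a fraction."""
    return _NO_MILLIS.sub(r"\1.000", sbol_text)


before = "<prov:endedAtTime>2017-03-06T15:00:00+00:00</prov:endedAtTime>"
after = add_millis(before)
print(after)
```

The text of the written file (minimal_example.xml in the programs above) could be read, passed through add_millis, and written back before uploading.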

@cjmyers

cjmyers commented Aug 11, 2020

This is the code in libSBOLj that throws this validation error:

DateTimeFormatter fmt = ISODateTimeFormat.dateTime();
try {
	endedAtTime = fmt.parseDateTime(((Literal<QName>) namedProperty.getValue()).getValue().toString());
} catch (IllegalArgumentException e) {
	throw new SBOLValidationException("sbol-12403", topLevel.getIdentity());
}

As you can see, we are using a library to parse the time. This library is likely what is being very strict about the DateTime format. It is defined here:

https://www.joda.org/joda-time/apidocs/org/joda/time/format/DateTimeFormatter.html

In particular, I'm using this parser:

https://www.joda.org/joda-time/apidocs/org/joda/time/format/ISODateTimeFormat.html#dateTime--

It suggests that you can use a less strict one:

https://www.joda.org/joda-time/apidocs/org/joda/time/format/ISODateTimeFormat.html#dateTimeParser--

I could change to that one, if you are sure it is okay.

@cjmyers

cjmyers commented Aug 12, 2020

We discussed this issue today, and the big problem is that pySBOL2 is not round-tripping files. This is a serious bug. While not requiring the milliseconds is fine, dropping them is not. We expect that if a library reads and writes an SBOL file without making any changes, the written file will compare with no differences to the original. With pySBOL2 dropping the datatype and milliseconds, this is not the case, so this is a bug in pySBOL2.

If you think that the file you are generating should be valid SBOL, then you can log an issue with libSBOLj and I can change to a more permissive parser for validation, but this is a separate issue.

Thanks for your help tracking this down.

@tcmitchell
Collaborator

@IsabelMarleen you have a workaround and I think it is pretty clear at this point the issue is not related to composite part assembly. Is it ok to close this issue?

@tcmitchell
Collaborator

Update: regarding #381, it looks like including the rdf:datatype in a dateTime violates the SBOL 2.3 specification, so I no longer anticipate a fix soon.

@cjmyers

cjmyers commented Aug 12, 2020

Can you explain why you think it violates the specification?

@cjmyers

cjmyers commented Aug 13, 2020

If we go with semantic comparison, would this parser do the trick?

https://www.joda.org/joda-time/apidocs/org/joda/time/format/ISODateTimeFormat.html#dateTimeParser--

@tcmitchell
Collaborator

> Can you explain why you think it violates the specification?

The explanation is on the linked issue, #381

@tcmitchell
Collaborator

> If we go with semantic comparison, would this parser do the trick?
>
> https://www.joda.org/joda-time/apidocs/org/joda/time/format/ISODateTimeFormat.html#dateTimeParser--

It's hard to tell. The dateTimeParser says it will parse a date or a time or a date-time. It might cast too wide a net. Another option might be to try dateTimeNoMillis if dateTime fails to parse. I can't tell which approach is better or easier, and I don't have a Java environment within which to test.
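
For illustration only, here is the fallback idea sketched in Python rather than Java, using the standard library instead of Joda-Time. The function name parse_datetime is hypothetical. It tries a format with fractional seconds first and falls back to one without, and both formats still require a timezone. Because the results are timezone-aware, comparing them is a semantic comparison of instants, not of strings.

```python
from datetime import datetime

# Tried in order: with fractional seconds first, then without.
# %z accepts both "Z" and offsets such as "+00:00" (Python 3.7+).
_FORMATS = ("%Y-%m-%dT%H:%M:%S.%f%z", "%Y-%m-%dT%H:%M:%S%z")


def parse_datetime(value):
    """Parse a date-time string, tolerating missing fractional seconds."""
    for fmt in _FORMATS:
        try:
            return datetime.strptime(value, fmt)
        except ValueError:
            continue
    raise ValueError(f"not a recognised dateTime: {value!r}")


with_millis = parse_datetime("2017-03-06T15:00:00.000Z")
without_millis = parse_datetime("2017-03-06T15:00:00+00:00")
print(with_millis == without_millis)
```

Unlike a parser that also accepts a bare date or a bare time, this rejects anything that is not a full date-time, which addresses the "too wide a net" concern. Whether the equivalent Joda-Time combination behaves the same way would still need to be tested in Java.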

@tcmitchell
Collaborator

@IsabelMarleen are you set with this issue? If I don't hear back within a week I plan to close this issue.

@IsabelMarleen
Author

@tcmitchell I think the two issues you created cover the problem I encountered. Thank you for your help.
