Removing Ref from XML Schema
I have a large XML Schema (18,500+ lines) that I am trying to import into Oracle's XML DB, and I am running into the problem that there are about 10,000 refs declared in the schema. The problem with refs is that XML DB does not know what these columns are pointing at, so instead of making them columns in the table based on their datatype, it makes a new table with a large object as the only column. This means that if I run the schema as is, I will get some 200 tables just for the structure of the schema, then an extra 10,000 tables to accommodate all of the refs, some of which point at a simple Boolean datatype. Is there some tool out there, or does anyone have any suggestions on the easiest way to convert refs in an XML Schema to types? I really don't care how it does it, as long as it gets rid of the refs.
Thank you,
Tim
# 1 Re: Removing Ref from XML Schema
It is possible (although not simple) to convert many of the refs to elements using XSLT. A few ideas:
I'm assuming you have something like
<xsd:element ref="gender" minOccurs="0"/>
and then elsewhere, it's defined as
<xsd:element name="gender" type="genderType"/>
and
<xsd:simpleType name="genderType">
  <xsd:restriction base="xsd:string"/>
</xsd:simpleType>
For each type, you could recursively search for its definition. If you find that it reduces to one of the built-in base types (e.g. xsd:string, xsd:date), you could replace the type attribute with that base type.
eg.
<xsl:template match="xsd:element">
  <xsl:choose>
    <xsl:when test="@type">
      <!-- test if it's a simple type,
           otherwise find the node or type and copy its definition here instead. -->
    </xsl:when>
    <xsl:otherwise>
      <xsl:element name="{name()}">
        <xsl:copy-of select="@*"/>
        <xsl:apply-templates />
      </xsl:element>
    </xsl:otherwise>
  </xsl:choose>
</xsl:template>
The danger with this method is circular references, which you would have to check for.
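A minimal sketch of the recursive lookup described above, with a crude guard against circular definitions. The key name `stmap`, the template name `resolve-base`, and the depth limit of 20 are all made up for illustration, and the sketch assumes type names are written without a namespace prefix, as in the examples above. It only follows `xsd:restriction/@base`, and any facets (enumerations, patterns, lengths) on the intermediate simple types are lost when the type attribute is rewritten.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema">

  <!-- Global simple types, indexed by name (hypothetical key name). -->
  <xsl:key name="stmap" match="/xsd:schema/xsd:simpleType" use="@name"/>

  <!-- Follow restriction/@base until no local simpleType matches.
       The depth counter stops runaway recursion on circular definitions. -->
  <xsl:template name="resolve-base">
    <xsl:param name="type"/>
    <xsl:param name="depth" select="0"/>
    <xsl:variable name="base" select="key('stmap', $type)/xsd:restriction/@base"/>
    <xsl:choose>
      <xsl:when test="$base and $depth &lt; 20">
        <xsl:call-template name="resolve-base">
          <xsl:with-param name="type" select="string($base)"/>
          <xsl:with-param name="depth" select="$depth + 1"/>
        </xsl:call-template>
      </xsl:when>
      <xsl:otherwise>
        <!-- Built-in type, complex type, or depth limit reached: keep the name. -->
        <xsl:value-of select="$type"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

  <!-- Identity template: copy everything not handled below. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Rewrite the type attribute of every element declaration. -->
  <xsl:template match="xsd:element/@type">
    <xsl:attribute name="type">
      <xsl:call-template name="resolve-base">
        <xsl:with-param name="type" select="string(.)"/>
      </xsl:call-template>
    </xsl:attribute>
  </xsl:template>
</xsl:stylesheet>
```

With the gender example above, `type="genderType"` would be rewritten to `type="xsd:string"`, while types that have no matching local simpleType are left untouched.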
You could import other definitions by using the document function. Eg. search for the <xsd:import> elements at the beginning of a document, and then import them using the document function.
Sample XSLT for the main template (context is the xsd:schema root):
<xsl:variable name="imports">
  <xsl:for-each select="xsd:import">
    <import>
      <xsl:copy-of select="document(@schemaLocation)"/>
    </import>
  </xsl:for-each>
</xsl:variable>
Would really need more information. namespaces, structure, etc. to help. It is not a simple task.
# 2 Re: Removing Ref from XML Schema
You're on fire. 8)
As you might have guessed this was the first post for the problem you answered in the other thread. I posted it on 3 different forums and I got 1 response by noon yesterday which was pretty much, "Use XSLT".
Looks like you hit it on the nose, and I think the other stuff you gave me should be enough to make it work. The one thing that makes it easier is that I don't care about getting to the base type to insert into the original type. In other words, the:
<xsd:element ref="gender" minOccurs="0"/>
tag could be simply replaced with:
<xsd:element name="gender" type="genderType" minOccurs="0"/>
and I would not have to touch either of the other tags. That seems to be the best way to go, as the global tags are often referenced by multiple sources, so modifying them when we were done converting would affect all other areas that point there.
Fortunately, there are no imports in the schema I am using; it is all local to the schema, so I don't have to worry about pulling in external data. Thank you for bringing it up, though. I hadn't even thought about that, and it's nice to have the code to do that if it ever comes up.
Again, thanks for the responses, it should make this much easier.
I do have another question though;
I am going to have to merge a lot of elements together. For example, I could have the ref element declaring the maxOccurs while the referenced element is declaring some other attribute I need, or vice versa. I need the combined set of attributes from the two. For example, I could have:
<xsd:element ref="gender" maxOccurs="1"/>
and later
<xsd:element name="gender" type="genderType" minOccurs="1"/>
Now, I realize that maxOccurs="1" and minOccurs="1" are redundant as they are the defaults, but it's just to show the example. The end result I would want is the first element looking like:
<xsd:element name="gender" type="genderType" maxOccurs="1" minOccurs="1"/>
Any suggestions on that? I was thinking of just taking the referenced element, and copying all of the attributes from it, then going to the original element and copying all but the "ref" attribute, but what would happen if there was the same attribute in both tags? Is that even possible? Will it allow a reference to specify a particular attribute that is also present in the element it references?
Thanks,
Tim
# 3 Re: Removing Ref from XML Schema
I think I got it, can't get it to work with namespaces properly though; Basically, the namespace prefix, (probably xs:, or xsd:) has to be hard coded in. You also lose the default namespace if it's different than the standard one.
bas won't let me upload it as an xslt file.. oh and comments <!-- -->are also filtered out.
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsl:key name="elmap" match="xsd:element" use="@name"/>
  <xsl:template match="/xsd:schema">
    <xsd:schema>
      <xsl:apply-templates/>
    </xsd:schema>
  </xsl:template>
  <xsl:template match="xsd:element">
    <xsl:choose>
      <xsl:when test="@ref">
        <xsl:variable name="match" select="key('elmap', @ref)"/>
        <xsl:choose>
          <xsl:when test="$match/*">
            <!-- if more children, leave as is -->
            <xsl:copy-of select="."/>
          </xsl:when>
          <xsl:otherwise>
            <xsl:element name="{name()}">
              <xsl:copy-of select="@*[name() != 'ref']"/>
              <xsl:variable name="ele" select="."/>
              <xsl:for-each select="$match/@*">
                <xsl:variable name="name" select="name()"/>
                <xsl:if test="not($ele/@*[name() = $name])">
                  <xsl:copy />
                </xsl:if>
              </xsl:for-each>
            </xsl:element>
          </xsl:otherwise>
        </xsl:choose>
      </xsl:when>
      <xsl:otherwise>
        <xsl:element name="{name()}">
          <xsl:copy-of select="@*"/>
          <xsl:apply-templates/>
        </xsl:element>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>
  <xsl:template match="*">
    <xsl:element name="{name()}">
      <xsl:copy-of select="@*"/>
      <xsl:apply-templates/>
    </xsl:element>
  </xsl:template>
</xsl:stylesheet>
# 4 Re: Removing Ref from XML Schema
Basically, the namespace prefix, (probably xs:, or xsd:) has to be hard coded in.
No no, don't do that. To set the namespace on an element created with xsl:element, you must set the namespace attribute on the xsl:element element to the namespace URI. Like this:
<xsl:element name="{name()}" namespace="{namespace-uri()}">
This copies both name and namespace from the current element.
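As a rough sketch, this is how the catch-all template from the stylesheet above might look with that change applied. Nothing here is new apart from the namespace attribute. One caveat worth keeping in mind: xsl:element does not carry over the source element's namespace declarations, so a prefix that appears only inside attribute values (for example in type="...") may end up undeclared in the output. Using xsl:copy instead of xsl:element copies the element's namespace nodes as well and avoids that.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Rebuild every element under its original name AND namespace URI,
       so no prefix has to be hard coded in the stylesheet. -->
  <xsl:template match="*">
    <xsl:element name="{name()}" namespace="{namespace-uri()}">
      <xsl:copy-of select="@*"/>
      <xsl:apply-templates/>
    </xsl:element>
  </xsl:template>

</xsl:stylesheet>
```

Text nodes still pass through via the built-in template rules, while comments and processing instructions are dropped, just as in the original stylesheet.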
khp at 2007-11-10 3:30:57 >

# 5 Re: Removing Ref from XML Schema
Wow. You didn't have to go and write the whole transformation for me. 8)
I just finished my first try at the transformation and was running into errors like that namespace thing you were talking about. The approach I was taking was to treat the schemas as simple XML files and doctor the root nodes where I needed to in order to make it work, but it looks like there is a better way. The way I was going to get around the namespace was with a replace of xsd: with xsd_, but khp's way sounds better. Thank you so much for all the help, I will be checking this out on Monday. Looks really cool though.
Thanks again,
Tim
# 6 Re: Removing Ref from XML Schema
To work with namespaces in XSL, the trick is not to worry about the namespace declarations, i.e. 'xmlns:xsd="whatever"', or the namespace prefixes, i.e. 'xsd:whatever'. The XSL engine will take care of these things.
All you need to do is tell it what namespace an output node should belong to. This can be done either by setting the namespace attribute on an xsl:element node or, if you are creating a static node, by using a prefix that has been declared in the stylesheet, for example by simply writing <xsd:element> directly in the stylesheet.
Or you can set the default namespace on the stylesheet by simply setting the attribute xmlns="whatever".
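A small sketch of the first two options side by side. The element and attribute values (gender, xsd:string) are just placeholders borrowed from earlier in the thread.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema">

  <xsl:template match="/">
    <!-- Static node: the xsd prefix is declared on the stylesheet,
         so the engine emits the matching xmlns:xsd declaration itself. -->
    <xsd:schema>
      <!-- Computed node: the namespace is given by URI on xsl:element. -->
      <xsl:element name="xsd:element"
                   namespace="http://www.w3.org/2001/XMLSchema">
        <xsl:attribute name="name">gender</xsl:attribute>
        <xsl:attribute name="type">xsd:string</xsl:attribute>
      </xsl:element>
    </xsd:schema>
  </xsl:template>

</xsl:stylesheet>
```

Both nodes end up in the XML Schema namespace. Which prefix the engine prints for the computed node is up to the engine, although most will reuse xsd here.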
khp at 2007-11-10 3:32:52 >

# 7 Re: Removing Ref from XML Schema
Well, jkmyoung, your transformation worked perfectly, and got my script 90% transformed with no changes. I then started working on it to make it do the rest of what I need it to do, and I have hit a little snag (I have finished it by hand, but I don't want to have to do this every time I rerun the transformation).
The scenario is the part you accommodated in the transformation with the comment "if more children, leave as is". The problem is that I can't leave it as is. 8(
The best I could come up with is changing the "ref" to a "name", then adding the "type" attribute with a value I can search on (I used "#transform#", knowing it would not appear anywhere in the schema). Then, when the transformation was completed, I searched the document for these strings and changed them to the name of the type, then went down to the referenced element and changed it to a complex type with that name. It is some 31 referencing elements and about 15 referenced elements that have to be changed, so it's not a big deal, but if I can automate it, that would be better.
Do either of you have any suggestions on that? Is it possible to do what I am wanting to do? This is low priority, btw, as I have everything I need to move onto the next step, I just would like to get this part worked out as well.
Thanks again,
Tim
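One possible way to automate this step, offered only as a sketch. It assumes that refs are written without a namespace prefix, that the referenced global elements carry an anonymous xsd:complexType child, and that appending "Type" to the element name gives a type name not already used in the schema. The "Type" suffix is a made-up convention, not something from the original schema.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema">

  <!-- Global (top-level) element declarations, indexed by name. -->
  <xsl:key name="elmap" match="/xsd:schema/xsd:element" use="@name"/>

  <!-- Identity template: copy anything not handled below. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- A ref whose target has an anonymous complexType:
       ref="x" becomes name="x" type="xType" (assumed naming scheme). -->
  <xsl:template match="xsd:element[@ref][key('elmap', @ref)/xsd:complexType]">
    <xsl:copy>
      <xsl:copy-of select="@*[name() != 'ref']"/>
      <xsl:attribute name="name"><xsl:value-of select="@ref"/></xsl:attribute>
      <xsl:attribute name="type"><xsl:value-of select="concat(@ref, 'Type')"/></xsl:attribute>
      <xsl:apply-templates/>
    </xsl:copy>
  </xsl:template>

  <!-- A global element with an anonymous complexType: keep the element,
       point it at the new named type, and emit that type as a sibling. -->
  <xsl:template match="/xsd:schema/xsd:element[xsd:complexType]">
    <xsl:copy>
      <xsl:copy-of select="@*"/>
      <xsl:attribute name="type"><xsl:value-of select="concat(@name, 'Type')"/></xsl:attribute>
      <xsl:apply-templates select="node()[not(self::xsd:complexType)]"/>
    </xsl:copy>
    <xsd:complexType name="{concat(@name, 'Type')}">
      <xsl:apply-templates select="xsd:complexType/@*|xsd:complexType/node()"/>
    </xsd:complexType>
  </xsl:template>

</xsl:stylesheet>
```

This hoists the type of every global element that has an anonymous complexType, not only the ones that are actually referenced. If the global element itself is no longer wanted afterwards, the xsl:copy part of the last template could be dropped so that only the named complexType remains. Refs nested inside a hoisted type are still rewritten, because the type's content goes back through apply-templates.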
# 8 Re: Removing Ref from XML Schema
I have another 2 part question.
1) I need to add a namespace to the output schema:
xmlns:xdb="http://xmlns.oracle.com/xdb"
This is for the Oracle XML DB elements and attributes. I tried using <xsl:attribute> to create it and it gave me an error on the "xmlns:xdb". I am assuming it doesn't like me adding a namespace to the schema. This is the code I was using:
<xsl:template match="/xsd:schema">
<xsd:schema>
<xsl:attribute name="xmlns:xdb">http://xmlns.oracle.com/xdb</xsl:attribute>
<xsl:attribute name=" xdb:storeVarrayAsTable">true</xsl:attribute>
<xsl:apply-templates/>
</xsd:schema>
Note I also need the attribute:
xdb:storeVarrayAsTable="true"
in the schema. It currently complains about "xdb:" saying it is not a valid namespace. Again, I can use the "xdb_" work around, but it would be nice to know the correct way to do this.
I looked at <xsl:namespace-alias> and that does not look like what I am needing. I still need the xsd: namespace, I simply need to add a namespace. Any help would be cool.
2) The second part of the question is basically the "xdb:" question again. If I add the namespace at the top of the schema with whatever syntax I am supposed to use for that, will later references to "xdb:" succeed? Specifically, I need to add an attribute to each <xsd:complexType> of "xdb:SQLType". I can do a search and replace on SQLType to xdb:SQLType, but again I want to do it the right way.
Thanks,
Tim
# 9 Re: Removing Ref from XML Schema
I tried using <xsl:attribute> to create it and it gave me an error on the "xmlns:xdb". I am assuming it doesn't like me adding a namespace to the schema.
You can't do it that way.
Did you not read my message above (Comment #7)?
I'll try to be more clear...
To get the namespace set correctly in the output document, you should just declare the namespace on the stylesheet node, and then declare any output element or attribute to belong to that namespace. The XSL engine will then place the xmlns:xdb declaration where it sees fit.
Like this...
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:xsd="http://www.w3.org/2001/XMLSchema"
xmlns:xdb="http://xmlns.oracle.com/xdb">
<xsl:template match="/xsd:schema">
<xsd:schema>
<xsl:attribute name="storeVarrayAsTable" namespace="http://xmlns.oracle.com/xdb">
<xsl:text>true</xsl:text>
</xsl:attribute>
</xsd:schema>
</xsl:template>
</xsl:stylesheet>
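For a fixed attribute like this one there is also a shorter form, sketched below. Because the xdb prefix is declared on the stylesheet, the attribute can be written directly on the literal xsd:schema element. The xsl:copy-of line and the identity template are additions that are not in the stylesheet above. They are there so the sketch also keeps the original schema attributes and content.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema"
    xmlns:xdb="http://xmlns.oracle.com/xdb">

  <xsl:template match="/xsd:schema">
    <!-- Literal attribute in the xdb namespace; the xmlns:xdb declaration
         is copied to the output root along with the element. -->
    <xsd:schema xdb:storeVarrayAsTable="true">
      <xsl:copy-of select="@*"/>
      <xsl:apply-templates/>
    </xsd:schema>
  </xsl:template>

  <!-- Identity template for everything below the root. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

Since the xmlns:xdb declaration lands on the root element of the output, xdb: attributes created further down are in scope of it.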
I looked at <xsl:namespace-alias> and that does not look like what I am needing.
No, namespace-alias is for writing stylesheets that produce stylesheets.
khp at 2007-11-10 3:36:02 >

# 10 Re: Removing Ref from XML Schema
I see, I was confused as to where to declare the namespace. Thanks, that clears it up.
Now, it is working perfectly for the "storeVarrayAsTable" attribute, but not for the "SQLType" attribute. I have the following code for the schema declaration:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xdb="http://xmlns.oracle.com/xdb">
<xsl:key name="elmap" match="xsd:element" use="@name"/>
<xsl:key name="grpmap" match="xsd:group" use="@name"/>
<xsl:template match="/xsd:schema">
<xsd:schema>
<xsl:attribute name="xdb:storeVarrayAsTable" namespace="http://xmlns.oracle.com/xdb">true</xsl:attribute>
<xsl:apply-templates/>
</xsd:schema>
</xsl:template>
...
And that output the schema correctly as follows:
<xsd:schema xdb:storeVarrayAsTable="true" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xdb="http://xmlns.oracle.com/xdb">
However, later I build the SQLType attribute using the same method:
<xsl:template match="xsd:complexType">
<xsl:element name="{name()}">
<xsl:copy-of select="@*"/>
<xsl:if test="@name">
<xsl:attribute name="SQLType" namespace="http://xmlns.oracle.com/xdb"><xsl:value-of select="substring(concat('TY_',translate(translate(@name,'abcdefghijklmnopqrstuvwxyz','ABCDEFGHIJKLMNOPQRSTUVWXYZ'),'_','')),1,30)"/></xsl:attribute>
</xsl:if>
<xsl:apply-templates/>
</xsl:element>
</xsl:template>
And I get the output:
<xsd:complexType name="TestElement" auto-ns1:SQLType="TY_TESTELEMENT" xmlns:auto-ns1="http://xmlns.oracle.com/xdb">
Are there special steps I have to take to use this later in the schema? Am I using it wrong the second time?
Thanks,
Tim
# 11 Re: Removing Ref from XML Schema
I'm sorry, but I find this quite humorous ;)
<xsd:complexType name="TestElement" auto-ns1:SQLType="TY_TESTELEMENT" xmlns:auto-ns1="http://xmlns.oracle.com/xdb">
I guess from a technical standpoint this is perfectly correct and legitimate code, if not very pretty. XML-wise it doesn't really matter if it says xdb:whatever or auto-ns1:whatever, as long as xdb and auto-ns1 refer to the same namespace URI.
I'm guessing that the XSL engine you are using is not quite as clever at handling namespaces as the one I use (Xalan-J). If I ask Xalan to run the same transform, I get a much nicer looking <xsd:complexType name="TestElement" xdb:SQLType="TY_TESTELEMENT">
It is perhaps worth noting that in the case of the 'storeVarrayAsTable' attribute, you write name="xdb:storeVarrayAsTable" rather than name="storeVarrayAsTable" as I suggested. This actually produces something less pretty in Xalan, but perhaps if you wrote name="xdb:SQLType" for your SQLType attribute, you would get what you want.
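A sketch of the SQLType template with the prefixed name, for reference. It assumes the xdb prefix is declared on the stylesheet as shown earlier. The prefix in the name attribute is only a hint to the engine, so the exact output can still differ between processors. The two nested translate() calls are folded into one here: characters in the second argument that have no counterpart in the third are removed, which is how the underscore gets dropped.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema"
    xmlns:xdb="http://xmlns.oracle.com/xdb">

  <!-- Identity template: copy everything not handled below. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Named complex types get an xdb:SQLType built from the type name:
       upper-cased, underscores removed, TY_ prefix, cut to 30 characters. -->
  <xsl:template match="xsd:complexType[@name]">
    <xsl:copy>
      <xsl:copy-of select="@*"/>
      <xsl:attribute name="xdb:SQLType" namespace="http://xmlns.oracle.com/xdb">
        <xsl:value-of select="substring(concat('TY_', translate(@name, 'abcdefghijklmnopqrstuvwxyz_', 'ABCDEFGHIJKLMNOPQRSTUVWXYZ')), 1, 30)"/>
      </xsl:attribute>
      <xsl:apply-templates/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

For a type named TestElement this yields the value TY_TESTELEMENT, the same as the nested-translate version.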
khp at 2007-11-10 3:38:04 >

# 12 Re: Removing Ref from XML Schema
That's funny. I didn't realize I had the alias in the storeVarrayAsTable attribute, but I added it to the SQLType and it worked. I am using XMLSpy on the Microsoft parser. Interesting.
Again, thanks for the help,
Tim