July 30, 2012

Techbits #19: How can an XSD be constructed from an XML?

You have an XML file but do not have its schema definition (XSD). Now, you would like to get your hands on one of its possible XSDs. What do you do?

Helpful information on generating an XSD from XML can be found at http://www.dotkam.com/2008/05/28/generate-xsd-from-xml/. It is a useful starting point whenever you need to build and provide schema definitions for XML files.

Specifically, "trang" is an open source utility, located at http://www.thaiopensource.com/relaxng/trang.html, that does the job reasonably well. Other utilities exist as well, but many of them are commercial.
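As a rough illustration of how trang is typically invoked, here is a minimal Python sketch that builds the command line for it. It assumes Java is installed and that the trang JAR has been downloaded; the helper name build_trang_command, the default JAR location "trang.jar" and the sample file names are hypothetical. trang works out the input and output formats from the file extensions, which is why the helper checks them.

```python
from pathlib import Path


def build_trang_command(xml_path, xsd_path, trang_jar="trang.jar"):
    """Return the argument list that asks trang to infer an XSD from an XML file.

    trang_jar is an assumed location for the downloaded JAR; adjust as needed.
    """
    xml_path = Path(xml_path)
    xsd_path = Path(xsd_path)
    # trang infers formats from extensions, so insist on .xml in and .xsd out.
    if xml_path.suffix.lower() != ".xml":
        raise ValueError("input file should have a .xml extension")
    if xsd_path.suffix.lower() != ".xsd":
        raise ValueError("output file should have a .xsd extension")
    return ["java", "-jar", str(trang_jar), str(xml_path), str(xsd_path)]
```

To actually run it, pass the result to subprocess.run(..., check=True). The generated schema is only one of the possible XSDs for the sample file, so it is worth reviewing by hand.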

There are also online tools that let you upload an XML file and download the generated XSD. However, this approach poses security risks and cannot be used for files carrying sensitive content. This is all the more critical for work-related files, since uploading them could violate corporate policy. Keep this caution in mind at all times: do not use online tools for official work.

July 21, 2012

Techbits #18: Editing ClearCase files from Windows leads to ^M characters?

A common problem reported by developers accessing ClearCase VOBs/views from Windows is that edited text files (source code, scripts, configuration files, etc.) contain Ctrl+M (^M) characters when they are eventually checked in or transferred to Unix-based servers such as Solaris or Linux. This can lead to build problems later on.

Most people work around this issue by stripping the extra Ctrl+M characters from the files on the Unix machines with commands like "dos2unix" or "sed 's/^M$//g' ...". However, the issue should be fixed at the source (when the file is edited) and not at the destination.
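For reference, the destination-side workaround amounts to the following. This is only a minimal Python sketch of the idea, not how dos2unix itself is implemented; the function names are hypothetical. It works on raw bytes so that Python's own newline translation does not interfere.

```python
def crlf_to_lf(data: bytes) -> bytes:
    """Replace Windows line endings (CR LF) with Unix line endings (LF)."""
    # Only CR characters that sit directly before an LF are removed.
    return data.replace(b"\r\n", b"\n")


def convert_file_in_place(path):
    """Rewrite one file with LF line endings, touching it only if needed."""
    with open(path, "rb") as handle:
        original = handle.read()
    converted = crlf_to_lf(original)
    if converted != original:
        with open(path, "wb") as handle:
            handle.write(converted)
```

Like the sed command above, this removes a carriage return only at the end of a line.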

How, then, do you fix the problem? By setting the file format (line endings) properly in your text editor and ensuring that the format is retained when you edit existing files. If your text editor (like Notepad or WordPad) does not support these settings, you should consider switching to one that does.

If you're using Notepad++, then this can be specified by following Settings >> Preferences >> New Document/Default Directory and setting the parameter "Format" to "Unix". Note that Notepad++ needs to be restarted for the change to take effect. Once restarted, you can verify the setting by checking that the status bar shows UNIX for a new document. Similarly, when you open an existing UNIX-format text file, the status bar will show UNIX.

Similar options can be found in standard IDEs like Eclipse and others.
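If you would like to double-check your files before check-in rather than rely on the status bar alone, a small script can flag any file that still has Windows line endings. The following is a minimal Python sketch; the function names and the idea of passing in a list of paths are assumptions for illustration and are not part of ClearCase or any editor.

```python
from pathlib import Path


def has_crlf(data: bytes) -> bool:
    """Return True if the content contains at least one CR LF line ending."""
    return b"\r\n" in data


def files_with_crlf(paths):
    """Return the subset of paths whose content has Windows line endings."""
    flagged = []
    for path in paths:
        # Read as bytes so that no newline translation hides the CR characters.
        if has_crlf(Path(path).read_bytes()):
            flagged.append(str(path))
    return flagged
```

An empty list from files_with_crlf means none of the given files would show ^M characters on the Unix side.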