

Experiences of Uploading Latex to Connexions

by heather.

Some time ago I started uploading all the updated versions of the Free High School Science Texts (FHSST) to Connexions. The books cover grades 10 – 12 maths and physical science for the South African school curriculum. With several hundred pages of content and 99 chapters, the task seemed daunting. Mark Horner showed me the steps in the process, and together we started on the longest chapter, which we thought would expose most of the problems we were likely to encounter. After a few attempts the chapter was still not uploaded, but our process was much more streamlined and working better, so I started on the rest of the chapters. After much effort I succeeded in getting one chapter uploaded. Now, several weeks later, we have the majority of the chapters uploaded, and just recently we published the Grade 10 Maths book as a collection on Connexions.

So why was this process so challenging? What were some of the many problems encountered?

The books were all written in Latex (a typesetting mark-up language). Latex has many different commands and environments that do pretty much the same thing (e.g. mbox, rm, makebox), and the Connexions website only supports a few of them. Our books were all written by volunteers, and each person has their own preferred way of typesetting in Latex.

Also, we created most of the images using PSTricks, which is not supported by Connexions. So the first step was to remove all of these pictures and replace them with PNG or EPS versions. Originally we uploaded both the PNG and the EPS files, but then we discovered that the PNG files were modified during the import process and the resulting resolution was quite poor. We got a better result by including only the EPS files and letting Connexions generate the PNG files itself. The script to do this had never been tested, but fortunately it worked quite well and the images came out looking pretty good. A feature request has been submitted to address the handling of uploaded PNG files, as there are use cases where the EPS for print and the PNG for web could well look quite different (colour vs. black and white is a simple one).

Rory Adams wrote a Python script to convert all the images to EPS and PNG. However, the script needed to be tweaked, as it did not work very well and kept cropping the images badly. After some tweaking and the addition of a Perl program that Mark Horner wrote, the script ran much better and only a few images needed to be redone.

So, armed with the Python script, the Latex files, a list of Latex environments that definitely needed removing and lots of patience, I set out to upload all the FHSST books to Connexions. After much hard work I had attempted all the Grade 10 chapters, with a minimal success rate. Obviously something else was still causing problems. But I persevered and got through most of the Grade 11 chapters with a better success rate. Around the time I started on the Grade 12 chapters, Mark went to the Plone conference, and while there he tweaked the process even more and made a much more streamlined version. Using this I was able to get the Grade 12 chapters up much faster and with greater success, and it also greatly improved the success rate for the Grade 10 chapters. Mark also set up a copy of the Connexions server in the OIS with all the required stuff on it. This was an attempt to make the uploading and error removal go faster, since it reduced the time spent waiting for Connexions to load and work out what changes had been made. Sadly there was a bug in the image generation, and with Mark away at the SF fellows gathering it could not be fixed (I do not yet have the skills needed to fix this kind of error, but one day…), so the faster process was abandoned. But I still managed to get through the chapters, and by the time Mark returned the majority of the chapters were up, with only a few still giving unusual errors.

What was the process and how did it evolve as time went by?

Originally we did:

The first step was to copy the chapter file into a working folder, then open a terminal and open the chapter file in vi. There we checked that each PSPicture block in the code was on a single line. Pretty quickly we realised we also needed to check for PSPicture code with a star in it. Next we checked for worked examples that did not have worked example steps and corrected these. Then the Python script was run to convert the images. A new chapter file was created, and we opened that in vi as well to check for and remove all mboxes and scaleboxes. Then we ran it through tralics and kile (a front-end for Latex). If all that checked out, we attempted the upload. If that worked, we downloaded the CNXML file and ran it through another script to remove code that had been added to deal with headings and special environments. Finally we uploaded the new CNXML file and hoped there were no errors.
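The "all on one line" check for PSPicture blocks can be sketched in Python. This is only an illustration and not the procedure or script we actually used: the function name `join_pspictures` is made up, and the sketch assumes the blocks contain no `%` comments (joining lines would otherwise swallow whatever follows a comment).

```python
import re

# Matches \begin{pspicture} ... \end{pspicture}, with or without a star.
# The backreference \1 makes sure a starred begin pairs with a starred end.
PSPICTURE = re.compile(
    r"\\begin\{pspicture(\*?)\}.*?\\end\{pspicture\1\}",
    re.DOTALL,
)


def join_pspictures(latex: str) -> str:
    """Put every pspicture block on a single line (hypothetical helper)."""

    def one_line(match: re.Match) -> str:
        # Strip each line of the block and drop the empty ones.
        lines = [line.strip() for line in match.group(0).splitlines()]
        return " ".join(line for line in lines if line)

    return PSPICTURE.sub(one_line, latex)


if __name__ == "__main__":
    sample = (
        "\\begin{pspicture*}(0,0)(2,2)\n"
        "  \\psdot(1,1)\n"
        "\n"
        "\\end{pspicture*}\n"
    )
    print(join_pspictures(sample))
```

Text outside the pspicture blocks is left untouched, so the result can be compared with the original file to see what changed.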

After much work we realised that this was not catching all the problem environments, so we came up with a much more streamlined process that took into account carriage returns, blank lines in funny places and all the previous problems.
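As a rough illustration of the carriage return and blank line clean-up, here is a minimal Python sketch. It is an assumption about what such a step could look like, not the fix we used: the name `normalise_whitespace` is made up, and "blank lines in funny places" is simplified here to runs of several blank lines.

```python
import re


def normalise_whitespace(latex: str) -> str:
    """Normalise line endings and collapse runs of blank lines (sketch)."""
    # Convert Windows (CR LF) and bare CR line endings to plain LF.
    text = latex.replace("\r\n", "\n").replace("\r", "\n")
    # Strip trailing spaces and tabs so whitespace-only lines become empty.
    text = re.sub(r"[ \t]+$", "", text, flags=re.MULTILINE)
    # Collapse two or more consecutive blank lines into a single blank line.
    return re.sub(r"\n{3,}", "\n\n", text)


if __name__ == "__main__":
    print(repr(normalise_whitespace("a\r\n\r\n  \r\n\r\nb\rc  \n")))
```

Since Latex treats any run of blank lines as one paragraph break, collapsing them should not change the typeset output, but a blank line inside an environment may still need to be removed by hand.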

Mark wrote a command for vi that fixed all the PSPictures, scaleboxes and blank lines in one go instead of several. So once again the chapter file was copied into the working folder and opened in vi. Now we could just run FHSSTClean and remove several problems at once. Next the worked examples were checked, and then the Python script was run. Then we opened the new Latex file in vi and removed the mboxes. After that we checked that tralics and kile were happy with the file and attempted the upload. After a successful upload we downloaded the CNXML file, removed the code that had been added and uploaded the new CNXML file. Then we fixed any remaining errors and published. Happiness.
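Removing wrappers such as scaleboxes and mboxes by hand is tedious, so here is a hedged Python sketch of the idea. It is not FHSSTClean (that is a vi command whose contents are not shown here); `read_group` and `unwrap` are made-up names, and the sketch assumes braces are balanced, are not escaped as `\{`, and follow the command name with no space. Note that stripping an mbox inside math mode changes how its contents are typeset, so the result still needs checking.

```python
def read_group(text: str, start: int) -> tuple[str, int]:
    """Return the contents of the {...} group at text[start] and the index after it."""
    depth = 0
    for pos in range(start, len(text)):
        if text[pos] == "{":
            depth += 1
        elif text[pos] == "}":
            depth -= 1
            if depth == 0:
                return text[start + 1:pos], pos + 1
    raise ValueError("unbalanced braces")


def unwrap(text: str, command: str, n_args: int = 1) -> str:
    """Replace each use of `command` with the contents of its last argument."""
    out = []
    i = 0
    while i < len(text):
        if text.startswith(command + "{", i):
            j = i + len(command)
            groups = []
            # Read up to n_args consecutive brace groups.
            while len(groups) < n_args and j < len(text) and text[j] == "{":
                content, j = read_group(text, j)
                groups.append(content)
            if len(groups) == n_args:
                # Keep only the last argument, unwrapping nested uses too.
                out.append(unwrap(groups[-1], command, n_args))
                i = j
                continue
        out.append(text[i])
        i += 1
    return "".join(out)


if __name__ == "__main__":
    line = r"\scalebox{0.8}{\begin{tabular}{cc}a&b\end{tabular}}"
    print(unwrap(line, r"\scalebox", n_args=2))
    print(unwrap(r"speed in \mbox{metres per second}", r"\mbox"))
```

A scalebox takes two arguments (the scale factor and the content), so it is unwrapped with `n_args=2`; an mbox takes one.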

However, we still do not have an easy fix for multirows in tables. Replacing mbox with rm also causes problems. But once all these have been worked through, most chapters upload.

What are the Latex environments to avoid if I want to upload to Connexions?

Mbox, array, multirows in tables, scaleboxes, pspictures, special user-generated environments, special commands (\frownie, \smiley, \diameter, \checkmark, \framebox, \margincompass, \makebox), anything that requires unusual packages, tabbing commands, spacing commands (excepting: \quad, \qquad, `, ~) and user-defined commands (e.g. \ms for metres per second). Also be careful with blank lines and carriage returns. There are probably many others; these are just the ones that I have encountered in the process.
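A quick way to see how much work a chapter needs is to scan it for the items in the list above before attempting an upload. The following Python sketch is only an illustration: the function name `find_problems` and the two sets are made up for this example, `\multirow` is my assumption for the command behind "multirows in tables", and the list is certainly not exhaustive.

```python
import re

# Command names taken from the list above (\multirow is an assumption).
COMMANDS_TO_AVOID = {
    "mbox", "makebox", "framebox", "scalebox", "multirow",
    "frownie", "smiley", "diameter", "checkmark", "margincompass",
}
# Environment names taken from the list above.
ENVIRONMENTS_TO_AVOID = {"array", "pspicture", "pspicture*"}

COMMAND = re.compile(r"\\([A-Za-z]+)")
ENVIRONMENT = re.compile(r"\\begin\{([^}]+)\}")


def find_problems(latex: str) -> list[tuple[int, str]]:
    """Return (line number, description) pairs for constructs to avoid."""
    problems = []
    for lineno, line in enumerate(latex.splitlines(), start=1):
        for name in COMMAND.findall(line):
            if name in COMMANDS_TO_AVOID:
                problems.append((lineno, "\\" + name))
        for name in ENVIRONMENT.findall(line):
            if name in ENVIRONMENTS_TO_AVOID:
                problems.append((lineno, name + " environment"))
    return problems


if __name__ == "__main__":
    chapter = "Speed is $5\\ \\mbox{m}$\n\\begin{pspicture}(0,0)(1,1)\n\\end{pspicture}\n"
    for lineno, what in find_problems(chapter):
        print(f"line {lineno}: {what}")
```

User-defined commands and unusual packages cannot be caught by a fixed list like this, so the scan is a first pass rather than a guarantee that a chapter will upload.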
