search

Kamis, 14 Juli 2011

Practical Tips on using Google Apps Scripts for Chemistry Applications

A few weeks ago I described our use of Google Apps Scripts, developed by Rich Apodaca and Andrew Lang, as an intuitive interface to information related to a chemistry laboratory notebook. Since then we have been using these tools to actively plan and record experiments (e.g. UC-EXP269) and we have learned their strengths and weaknesses.

The most problematic aspect of Google Apps Scripts running within Google Spreadsheets turns out to be the way caching and refreshing operate. There does not appear to be an obvious way to refresh a single cell. So if a script times out or fails, Google stores that failed output on their servers and will not run it again until some time has elapsed (which seems to be on the order of about an hour). Typing in a new input for that cell will cause the script to run again but entering a previously entered input will only retrieve the cached output, even a failed output. For example, if you have a cell calculating the MW from "benzene" entered in another cell and the script fails for any reason, typing in "ethanol" will get it to run again for the new input, but going back to "benzene" will just pull up the cached output of "Failed".

Nevertheless, I did come across some tricks to force a refresh indirectly. If you insert a row or column then re-enter the desired scripts in the new cells, they will run again. You simply need to then delete the old column with failed outputs. This is fine for simple sheets but it can be a headache for sheets that have several calculation dependencies between cells.

To avoid these complications, simply refresh the entire sheet by duplicating it, deleting the old sheet and then renaming the new one to the original name. The problem now is that it will refresh all the cells, not just those that had failed outputs. And if there are a large number of scripts on that sheet the odds are good that at least one will fail on that particular attempt, especially if several are hitting the same web server.

As a result of all these problems, I would not recommend using these services as I had initially hoped, where a researcher would enter data into a template sheet loaded with scripts to automatically generate a series of calculated outputs. There is a way to achieve this end but it requires thinking about the scripts in a slightly different way.

As I mentioned above, there are tricks for refreshing an entire sheet or a column or row. In order to avoid re-running the scripts that already returned desired outputs, we need to lock them in. This can be done by highlighting the completed cells, copying them (either control-c or Edit->Copy) then pasting them as values (from the Edit menu). Now refreshing will only be done on the cells with failed outputs and these can be locked in as well as soon as they complete.

The downside of this approach is that you lose the information about which script was run to generate the output values. And to change an input requires re-selecting the desired script. But in practice it is so convenient to hit a dropdown menu and hit getMW (for example) that this downside is quite minimal, especially when contrasted with the upside of knowing that others will see your information reliably, independent of how the services are running at a particular time.

Over the past few weeks we have found that some services fail more often than others and it would be advantageous to have some redundancies. This has been particularly problematic for the cactus services recently, which we often use for resolving common names. By using ChemSpiderIDs (CSIDs), the cactus services can be bypassed for several of the gONS services. So a good practice for any application is to generate and lock in SMILES and CSIDs right away from the common name. CAS numbers can be used too but the gChem service that Rich has created sometimes yields multiple CAS numbers and these will fail as input for a subsequent script.

We now have a chemistry Google Apps Scripts spreadsheet to keep track of which inputs are allowed for all the available services, along with information about the output, creator and description. We also keep track of requests and plans for new scripts, marked as "pending" under the status field.


Surprisingly, pasting images "as values" within a Google Spreadsheet cell does not ensure that they will appear consistently - often the cells are just blank upon loading. This makes the idea of using an embedded sheet to display reaction schemes within a wiki lab notebook page not practical. However, using the scripts and a template to generate the scheme by just typing the name, SMILES or CSID for the reactants and product is a very efficient way to generate a consistent look for schemes within a notebook. It only requires a final step of taking the image of the screen and cropping using Paint. For example, here is a scheme thus generated for UC-EXP269.


Taking into account all of these factors, the reaction template sheet we provide does not have by default any scripts running within cells (except for the images). However, it is set up to quickly adapt to other reactions for planning amounts of reactants (by weight or volume), calculating concentrations, yields, melting points (experimental and predicted), solubilities, links to ChemSpider, 2D rendering of structures (including full schemes) and links to interactive NMR spectra using ChemDoodle. It simply requires users to hit one of the 3 drop-down menus (gChem, gCDK or gONS) and select the appropriate script for a particular cell.

Even if the user does not want to use this particular reaction template it still makes sense to make a copy of the template sheet because it is an easy way to copy all of the necessary Google Script without opening the editor.

Tidak ada komentar:

Posting Komentar