Project Management Procedures with Unix

This page describes how to manage your Oracc corpus through a Unix terminal from a PC or Mac. It takes you through uploading files to Oracc, checking the project, adding new entries to the glossaries, making corrections to texts that are already online, and rebuilding the project website.

Throughout these instructions, substitute proj for the name of your project (e.g., obstn, cams, hbtin) and subproj for any subprojects you are running. [LANGUAGE] stands for any one of the Oracc language codes for the ancient languages in your project.

For more information on the oracc command, see the page on The Oracc Command.

Before you begin | Uploading files | Downloading files | Checking the corpus | Adding PSUs | Adding linguistic annotations | Harvesting | Merging the glossaries | Rebuilding | Correcting errors

Before you begin

Managing an Oracc corpus entails two types of communication with the Oracc server:

  1. Uploading files to Oracc; and
  2. Connecting to Oracc with a command-line (text-based) terminal programme to enable you to manage the files on the Oracc server.

To do this on a Mac, you will need to use the Terminal utility, which you will find in Applications/Utilities. You might find it useful to keep the Terminal in your Dock if you will be using it regularly. When the Terminal is open, hold your mouse down on its icon in the Dock (a black computer screen) and choose Keep in Dock.

On a PC, you will need to install a terminal utility such as PuTTY [http://www.chiark.greenend.org.uk/~sgtatham/putty/download.html].

When uploading files to Oracc, you may prefer to use a programme that gives a view of the local and remote folders and allows you to drag files from one to the other. If you are using a Mac, it is worth trying Cyberduck [http://cyberduck.en.softonic.com/mac] and for PCs there is WinSCP [http://winscp.net/eng/index.php].

Once you have installed the software you want, you need to secure your connection to the Oracc server. Generally speaking, you will only need to do this once.

Some useful Unix commands

There may be times when you need to move or delete project files on Oracc. Do this very carefully! The basic commands you need are:

cd
change directories, e.g., cd sources to move to the sources directory from your project's home directory; cd .. to move up a level in the directory hierarchy (for instance from sources to your home directory)
less
a pager to read text files, e.g., less oracc.log
ls
get a listing of the files and directories in the current directory
man
get documentation about a command, e.g., man ls
mkdir
create a new directory, e.g., mkdir photos
mv
move files (copy and delete the original, effectively; good for when you need to rename files), e.g., mv mb12345.atf bm12345.atf
passwd
change the password for your project; you will be asked for the current one (once) and the new one (twice)
rm
remove or delete files (use with caution!), e.g., rm bm12345.atf
rmdir
delete an empty directory (will not work if there are files in it), e.g., rmdir photos
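
As a minimal sketch, the commands above can be practised in a throwaway directory before you use them on real project files. The file and directory names are the ones used in the examples above; the temporary directory created with mktemp is an assumption made for illustration and is not part of any Oracc project.

```shell
#!/bin/sh
# Practise in a temporary directory, never in a live project folder.
sandbox=$(mktemp -d)         # hypothetical scratch area, removed at the end
cd "$sandbox" || exit 1

mkdir photos                 # create a new directory
touch mb12345.atf            # empty stand-in for a misnamed ATF file
mv mb12345.atf bm12345.atf   # rename the file
ls                           # shows bm12345.atf and photos
rm bm12345.atf               # delete the file; this cannot be undone
rmdir photos                 # succeeds because photos is empty
cd ..                        # move up one level in the hierarchy
rmdir "$sandbox"             # remove the now-empty scratch directory
```

Nothing is left behind once the script has finished, so it is safe to run repeatedly.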

Uploading ATF, ODS and catalogue files to Oracc

This stage is only necessary if your ATF and/or ODS files are not on Oracc already but are on your own computer and/or you have your own project catalogue. If you access your ATF files remotely from Oracc, and your project uses the CDLI catalogue [http://cdli.ucla.edu/], you can skip straight to Checking the Corpus.

Before you upload ATF or ODS files to Oracc, you must check that they are clean with the Checker webservice.

If you are using a graphical client such as Cyberduck, FuGu or WinSCP, you will need to enter the following information in the login dialogue box:

host: oracc.museum.upenn.edu
user name: proj
password: [your project's password]

Copy ATF files into your project's (or subproject's) 00atf/ folder and (if applicable) the XML file exported from your catalogue into your project's 00cat/ folder.

If you are comfortable using a command-line interface such as Terminal (on a Mac) or PuTTY (on a PC), then follow these instructions instead:
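
One possible command-line approach is sketched below. It assumes that scp is installed on your computer, that the server accepts the same host and user name given above for the graphical clients, and that the 00atf/ and 00cat/ folders sit directly in your project's home directory. The function name upload_cmd and the file name catalogue.xml are hypothetical. The script only prints each command so that you can review it before running it yourself.

```shell
#!/bin/sh
# Assumptions: scp is available locally; host and user name are the ones
# listed above for the login dialogue box.
ORACC_HOST="oracc.museum.upenn.edu"
PROJECT="proj"               # substitute your project's name

# Print (do not run) the scp command that would copy a local file
# into a folder in the project's home directory.
upload_cmd() {
  # $1 = local file, $2 = remote folder
  printf 'scp %s %s@%s:%s\n' "$1" "$PROJECT" "$ORACC_HOST" "$2"
}

upload_cmd bm12345.atf 00atf/      # an ATF file
upload_cmd catalogue.xml 00cat/    # hypothetical catalogue export
```

If a printed command looks right, paste it into the terminal; you should then be asked for your project's password.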

Downloading files from Oracc

For several steps in the project management process you will need to edit files. It is best to work with the file(s) on Oracc, even if you originally created them on your own computer, so that you can be confident that you have the latest version. To download files from Oracc, follow these instructions:

If you are using a graphical client such as Cyberduck, FuGu or WinSCP, you will need to enter the following information in the login dialogue box:

host: oracc.museum.upenn.edu
user name: proj
password: [your project's password]

Copy ATF and/or ODS files from your project's 00atf folder onto your own computer. Copy [LANGUAGE].glo files from your project's 00lib folder.

If you are using Terminal (on a Mac) or PuTTY (on a PC), then follow these instructions instead:
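
A hedged sketch of the reverse direction, from the server to the current folder on your own computer, is given below. It assumes scp is available and that the 00atf and 00lib folders sit directly in your project's home directory. The function name download_cmd and the variable LANG_CODE are hypothetical, and akk is used only as an example language code. The script prints the commands rather than running them.

```shell
#!/bin/sh
# Assumptions: scp is available locally; host and user name are the ones
# listed above for the login dialogue box.
ORACC_HOST="oracc.museum.upenn.edu"
PROJECT="proj"               # substitute your project's name
LANG_CODE="akk"              # example value standing in for [LANGUAGE]

# Print (do not run) the scp command that would copy a remote file
# to a local destination.
download_cmd() {
  # $1 = remote path, $2 = local destination
  printf 'scp %s@%s:%s %s\n' "$PROJECT" "$ORACC_HOST" "$1" "$2"
}

download_cmd 00atf/bm12345.atf .         # an ATF file
download_cmd "00lib/${LANG_CODE}.glo" .  # a glossary file
```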

Checking the corpus

Now you need to check that all is well with your corpus as a whole.

You can now move to Adding new PSUs to the glossary or straight to Harvesting new lemmatisation data.

Adding PSUs to the glossary

New Phrasal Semantic Units such as lumun libbi [sorrow] N and karṣa akālu [slander] V have to be added to the glossary by hand. Here's how to do it. First you need to download and open the glossary in an editor if you haven't already got it open.

When you have finished and saved [LANGUAGE].glo (you will need to give the password when prompted), you need to check it, either by using the ATF processor [http://oracc.museum.upenn.edu/util/atfproc.html] or by uploading the file to the 00lib/ directory on Oracc and checking the corpus.

Correct any errors that are listed, save [LANGUAGE].glo, and check again until no errors are listed.

Adding linguistic annotations to the glossary

Optionally, you can manually add information to the glossary about roots and dialects, and further explanations of guidewords.

Harvesting new lemmatisation data

This process collects together all the newly lemmatised data so that you can check it for errors and correct them before the big glossaries are rebuilt.

Now you need to check and correct the harvested lemmatisations.

Now you need to merge the new data with the existing glossaries.

Merging the glossaries

Only do this step when you are confident that all the lemmatisation data in the [LANGUAGE].new files is correct (see the section on Harvesting).

If, earlier in the process, you added new PSU data to a [LANGUAGE].glo that involved words not yet in the glossaries, you need to finish that job. Otherwise, go on to rebuild the website.

Now you are ready to rebuild the website.

Rebuilding the corpus

This is the final step in putting edited material online.

Correcting errors in online data

Sometimes it's necessary to correct mistakes in, or make improvements to, transliterations, translations, glossary entries, or metadata that have already been published to the server.

Correcting errors in the metadata

If you see mistakes in the metadata displayed in the left-hand sidebar on your project's website:

Update the catalogue installation by rebuilding the corpus. If you also want to make changes to ATF or ODS files, continue to the following section without rebuilding yet.

Correcting errors in ATF or ODS files

When correcting errors in ATF or ODS files, it is best to work with the file(s) on Oracc, even if you originally created them on your own computer, so that you can be confident that you have the latest version.

If you are correcting a lemmatised file, you will also have to delete the incorrect lemmatisation from the relevant glossary entry in [LANGUAGE].glo. Following the harvest and merge routine will add the correct new lemmatisation to [LANGUAGE].glo.

When you are done, check the corpus, fix any errors, and rebuild the website.

23 Jul 2014 osc at oracc dot org

Eleanor Robson

Eleanor Robson, 'Project Management Procedures with Unix', Oracc: The Open Richly Annotated Cuneiform Corpus, Oracc, 2014 [http://oracc.museum.upenn.edu/doc/help/managingprojects/procedures/]

 
 

Released under a Creative Commons Attribution Share-Alike license 3.0, 2014.