Extract still frame from a video
ffmpeg -y -i in.avi -vframes 1 -ss 00:00:36 -an -vcodec png -f rawvideo -s 640x480 out.png
The ffmpeg documentation provides the following, but this doesn't work with the current version of ffmpeg on Ubuntu:
ffmpeg -i in.avi -vf thumbnail,scale=300:200 -frames:v 1 out.png
Rename images by timestamp
exiftool "-FileName<CreateDate" -d "%Y%m%d_%H%M%S.%%e" ./
Create a gallery of images
Generate captions:
llgal -d <DIRECTORY> --gc --cc --cf --ct
Create gallery:
llgal -d <DIRECTORY> -n --sy 480 --exif Aperture,FocalLength,ImageHeight,ImageWidth,FileModifyDate --nf
For example, on a single line:
llgal --gc --cc --cf --ct; llgal -n --sy 480 --exif Aperture,FocalLength,ImageHeight,ImageWidth,FileModifyDate --nf; zip -r gallery . -i \*.html; zip -r gallery . -i .llgal; rm -r .llgal; rm *.html
Called from a directory containing images, this will create a zip archive containing the gallery. This archive can be unzipped on the server as required (using our modified AttachFile action, see MoinMoinCustomizations); this way we have:
- A collection of images that can be downloaded as a single zip archive;
- No dependency on Python Imaging Library (unavailable in our shared hosting environment!);
- No caching issues (Arnica takes a long time to generate thumbnails, which expire after a time and need to be regenerated; they also need to be created locally after a sync, which may cause timeouts and/or impatience on first page view).
Word count (html document)
lynx -dump -nolist filename.html | wc -w
Batch find-and-replace
perl -pi -w -e 's/find/replacement/g;' *.html
perl -CSD -Mutf8 -pi -w -e 's/([A-ZА-ЯӨҮ][a-zа-яөү]+)([0-9]{4})([A-ZА-ЯӨҮ][a-zа-яөү]+)/$1, \"$3\", $2/g;' test.moin
Graphviz diagram
dot -Tpng -ofilename.png filename.dot
Concatenate files
(e.g., to create a bibliography out of references in a directory)
cat *.txt >> total.html
perl htmlcat -s -o bibliography.html *.html
This latter script generates a single bibliography file.
To concatenate text files in a directory into an index with filenames and line numbers:
grep -n '.*' *.txt > index.txt
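As a minimal sketch of the indexing command above: the helper name build_index and the output file name are illustrative assumptions, not part of the original workflow. The -H flag forces the filename prefix even when only one file matches. The output is written to a file that does not match *.txt, because an index.txt left in the same directory would be picked up as input on a later run.

```shell
# Build a "file:line:text" index of every .txt file in a directory.
# $1: directory containing the .txt files
# $2: output file (named so that it does not match the *.txt glob)
build_index() {
    ( cd "$1" && grep -Hn '.*' *.txt ) > "$2"
}

# Example with two small throwaway files.
demo=$(mktemp -d)
printf 'alpha\nbeta\n' > "$demo/a.txt"
printf 'gamma\n' > "$demo/b.txt"
build_index "$demo" "$demo/index.idx"
cat "$demo/index.idx"
```

Each output line has the form filename:line number:text, so the demo prints three lines, starting with a.txt:1:alpha.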
Create encrypted files
In order to protect the privacy of research participants in uploading ethnographic documents, we have blanked out personally identifiable data in interview transcripts and other records.
This is done as follows:
- Mark sections of text containing personally identifiable information in curly brackets. For example, the text "My name is Otgonbayar and I was born in Sainshand" would be marked "My name is {Otgonbayar} and I was born in {Sainshand}".
- Rename other (non-text) files, or text files for which there will be no anonymized equivalent, so that they begin with the prefix "RESTRICTED." (including the trailing dot, so that they match RESTRICTED.*).
- Run one of the commands below to obfuscate personal information and create encrypted versions of the original files.
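The blanking of bracketed spans described in the steps above can be previewed on a single line of text before touching any files. This is only a sketch: it uses sed as a stand-in for the Perl substitution in the commands that follow, and the function name redact is illustrative.

```shell
# Replace every {...} span on standard input with a fixed-width mask.
# [^}]* stops at the first closing bracket, so several marked spans on
# one line are masked separately.
redact() {
    sed 's/{[^}]*}/########/g'
}

printf '%s\n' 'My name is {Otgonbayar} and I was born in {Sainshand}' | redact
```

For the sample sentence this prints "My name is ######## and I was born in ########". The fixed-width mask also hides the length of the original text.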
There are two options: (1) create directly encrypted files, or (2) create an encrypted 7z archive. The first option has the advantage of making encrypted files individually visible and downloadable, each with its own password; this allows us to provide access on a per-file basis, and doesn't require downloading an entire (potentially large) folder in order to gain access to a single document. On the other hand it is not quite as user-friendly as a password-protected zip file--particularly for non-technical / Windows users--since it requires installing an unfamiliar program.
By analogy: a compressed, password-protected zip folder is something like keeping restricted documents in a locked box inside a folder in a filing cabinet that anyone might browse through, possibly alongside non-restricted documents stored loosely in the folder; using mcrypt or gpg to encrypt individual files is like keeping loose documents in cypher. It is much easier for the casual user to unlock a box containing many documents, having been provided the key, than it is to decrypt individual documents.
(1) Directly encrypted files
perl -piRESTRICTED.* -w -e 's/\{.*?\}/########/g;' *.txt; FILENO=`ls RESTRICTED.* | wc -l`; for (( c=1; c<=$FILENO; c++ )); do echo `tr -dc '[:alnum:]-' < /dev/urandom | head -c 24` >> KEYS; done; mcrypt -u -f KEYS -a rijndael-256 RESTRICTED.*; mcrypt -u -a rijndael-256 KEYS
Decrypt using: mcrypt -d KEYS.nc; mcrypt -d -f KEYS RESTRICTED.*.nc
These commands create:
- Non-encrypted, anonymized versions of the text files, with personally identifiable data blanked out;
- An encrypted KEYS file containing a separate, randomly-generated key for each of the original files;
- Encrypted versions of the original files (labelled "RESTRICTED.*.*.nc"), using the keys from the KEYS file.
The above commands will ask for a password for the KEYS file, which is intended to be a universal keyword that is known to the system administrator only. Losing this password will make everything else non-recoverable!
For security purposes we may want to keep the KEYS file in a secure location. It is possible for this file to be kept in this directory within an offline moin repository, so long as it is not accidentally synchronized with other instances, or so long as it is itself encrypted using a unique key that is stored somewhere else.
Note that the order of the keys follows the order of the files in the directory. This works fine so long as we are using basic alphanumeric characters, and we don't add new files to the directory that will change the sort order (in which case it might be safest to decrypt and re-encrypt everything).
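The dependence on file order can be made concrete with a small sketch of the key-generation step alone. The function name make_keys is an assumption for illustration. Unlike the one-liner above, it loops over the glob directly rather than counting files with ls, and it reads a bounded 512 bytes from /dev/urandom so that the pipeline ends cleanly.

```shell
# Write one random 24-character key per RESTRICTED.* file to KEYS,
# in the sorted order in which the shell expands the glob.
# $1: directory containing the RESTRICTED.* files
make_keys() {
    dir=$1
    : > "$dir/KEYS"                      # start with an empty KEYS file
    for f in "$dir"/RESTRICTED.*; do
        [ -e "$f" ] || continue          # the pattern matched nothing
        # Keep only [A-Za-z0-9-] from a chunk of random bytes, take 24.
        key=$(head -c 512 /dev/urandom | LC_ALL=C tr -dc 'A-Za-z0-9-' | head -c 24)
        printf '%s\n' "$key" >> "$dir/KEYS"
    done
}

# Example: three placeholder files give a KEYS file with three lines.
demo=$(mktemp -d)
touch "$demo/RESTRICTED.a.txt" "$demo/RESTRICTED.b.txt" "$demo/RESTRICTED.c.jpg"
make_keys "$demo"
wc -l < "$demo/KEYS"
```

Line N of KEYS belongs to the Nth file in sorted order. Adding or renaming a file shifts every later pairing, which is why re-encrypting everything is the safe option.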
(2) 7Zip archive
We will use 7zip since it makes use of the cryptographically secure AES algorithm, unlike the legacy encryption in regular zip files, which is vulnerable to known-plaintext attacks. The 7zip utility is available for Windows, Linux, and other platforms.
Note that we might be able to integrate 7z functionality into MoinMoin, such that the user is able to view the document directly with an appropriate password. This is potentially useful for streaming videos in particular, though not necessarily secure unless we make use of SSL.
7za a RESTRICTED_ITEMS.7z RESTRICTED.* -pPASSWORD
So:
perl -piRESTRICTED.* -w -e 's/\{.*?\}/########/g;' *.txt; KEY=`tr -dc '[:alnum:]-' < /dev/urandom | head -c 24`; DIR=`pwd`; echo "$KEY" > ~/KEYS/`expr "$DIR" : '.*\/pages\/\(.*\)\/attachments'`; 7za a RESTRICTED_ITEMS.7z RESTRICTED.* -p"$KEY"; rm RESTRICTED.*; 7za l RESTRICTED_ITEMS.7z > RESTRICTED_ITEMS.txt;
The key is stored in the KEYS subdirectory of the user's home directory (must be created in advance), in a file named for the base page in the wiki. This works if we are running the script from within the attachments directory of a wiki page.
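The way the key file gets its name can be checked in isolation. This is a minimal sketch of the expr pattern match used in the command above. The function name page_name and the example path are hypothetical; only the .../pages/NAME/attachments layout is taken from the original command.

```shell
# Print the wiki page name embedded in an attachments directory path.
# expr matches the pattern from the start of the string and prints the
# part captured by \( ... \).
page_name() {
    expr "$1" : '.*/pages/\(.*\)/attachments'
}

page_name '/srv/moin/data/pages/FieldNotes/attachments'
```

For the example path this prints FieldNotes. Run from anywhere outside an attachments directory, the match fails and expr prints an empty line, so the one-liner above would not produce a usable key file name there.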
To write an encrypted key file to the current directory (which can be encrypted/decrypted without user input using the key from ~/.mcryptrc):
perl -piRESTRICTED.* -w -e 's/\{.*?\}/########/g;' *.txt; KEY=`tr -dc '[:alnum:]-' < /dev/urandom | head -c 24`; echo "$KEY" > KEY; 7za a RESTRICTED_ITEMS.7z RESTRICTED.* -p"$KEY"; rm RESTRICTED.*; 7za l RESTRICTED_ITEMS.7z > RESTRICTED_ITEMS.txt; mcrypt -u -a rijndael-256 KEY;
Or to generate a separate key for each item:
perl -piRESTRICTED.* -w -e 's/\{.*?\}/########/g;' *.txt; for f in RESTRICTED.*; do KEY=`tr -dc '[:alnum:]-' < /dev/urandom | head -c 24`; echo "$KEY $f" >> KEYS; 7za a RESTRICTED_ITEMS.7z "$f" -p"$KEY"; done; rm RESTRICTED.*; 7za l RESTRICTED_ITEMS.7z > RESTRICTED_ITEMS.txt; mcrypt -u -a rijndael-256 KEYS;
Note that having separate keys for all the items in the 7z file is liable to be quite inconvenient for end users; where possible I would prefer to set permissions on a folder level.
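With per-item keys, the administrator needs a way to find the key for one file when granting access. This is a sketch only: it assumes the KEYS file has already been decrypted and uses the "KEY FILENAME" line format written by the command above. The function name key_for and the sample keys are made up for illustration, and filenames containing spaces would not be handled.

```shell
# Print the key recorded for one file in a decrypted KEYS file.
# $1: path to the KEYS file, one "KEY FILENAME" pair per line
# $2: file name to look up
key_for() {
    awk -v name="$2" '$2 == name { print $1; exit }' "$1"
}

# Example with a throwaway KEYS file.
demo=$(mktemp -d)
printf '%s\n' 'abc123 RESTRICTED.a.txt' 'xyz789 RESTRICTED.b.jpg' > "$demo/KEYS"
key_for "$demo/KEYS" RESTRICTED.b.jpg
```

For the sample file this prints xyz789. A name that is not listed prints nothing.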
This will generate:
- RESTRICTED_ITEMS.7z: an archive containing the password-protected files
- KEY.nc or KEYS.nc: the key(s) for the 7z archive (store safely!)
- RESTRICTED_ITEMS.txt: list of items in the restricted archive
The moin attachments browser won't show the file names, so they should be listed in the catalogue entry, or at least an indication of the index should be provided.
Link checking
Find orphans (e.g., extraneous bibliographic reference files), etc. This makes a nice set of reports that can subsequently be called from our chrome javascript. If we want something a little nicer we can add anchors and css...
linklint -doc linklint -htmlonly -docbase .. /@
Other programs
- W3C slidemaker (modified script)
- site / directory contents lister
- indexer (i.e., list of pages and "backlinks")
- blog redirect script (redirects to the most recent document in a directory, where each "blog entry" is a separate html file)
- chrome
- bibliography maker (from references in a given document)
- reference maker (splits references from a full bibliography)
- rss maker (produces rss feed from set of html documents; send to feedburner or other cache!)
- image viewer (though would it make more sense to turn this into a series of compressed files?)
- Mongolian transliteration / tooltips producer
- kml generator: works from a csv index of sites and descriptions
- etc.
External programs
- reamweaver / dada engine: ...
- blogtorrent: not my program, but worth knowing about
Desktop utilities
- Amaya
- Mercurial (can we handle this through scripts rather than through Tortoise?)
- gThumb
- A video editor (avidemux? try OpenShot). The main thing here is to have watchable videos. DVD quality is probably too high, since we don't yet have the means to transmit huge amounts of video; VCD quality is acceptable, since we can get about 8 hours of video onto one DVD. Ideally I would like 800x600 resolution at 30 fps (DVD is 720x480), but that means we only get around an hour of video on a single disc. Storing the raw footage at that quality also becomes expensive, since we would soon be looking at terabytes of storage: 100 hours of video is about 470 GB! I don't think this is really necessary.
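The storage figures above can be reproduced with a rough calculation. This sketch assumes a single-layer DVD capacity of 4.7 GB and takes the hours-per-disc figures (about 1 at DVD quality, about 8 at VCD quality) from the note above. The function name storage_gb is illustrative.

```shell
# Estimate the storage needed for a number of hours of video.
# $1: hours of footage
# $2: hours of video that fit on one 4.7 GB disc at the chosen quality
storage_gb() {
    awk -v hours="$1" -v per_disc="$2" \
        'BEGIN { printf "%.0f\n", hours / per_disc * 4.7 }'
}

storage_gb 100 1   # DVD quality, about 1 hour per disc
storage_gb 100 8   # VCD quality, about 8 hours per disc
```

This prints 470 and 59, so the same 100 hours would need roughly an eighth of the space at VCD quality.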
