HOWTO Manage Fulltext Files
Usage: /opt/invenio/bin/bibdocfile [options]
Options:
--version show program's version number and exit
-h, --help show this help message and exit
-D, --debug
-H, --human-readable print sizes in human readable format (e.g., 1KB 234MB
2GB)
Query options:
-r RECIDS, --recids=RECIDS
matches records by recids, e.g.: --recids=1-3,5-7
-d DOCIDS, --docids=DOCIDS
matches documents by docids, e.g.: --docids=1-3,5-7
-a, --all Select all the records
--with-deleted-recs=yes/no/only
'Yes' to also match deleted records, 'no' to exclude
them, 'only' to match only deleted ones
--with-deleted-docs=yes/no/only
'Yes' to also match deleted documents, 'no' to exclude
them, 'only' to match only deleted ones (e.g. for
undeletion)
--with-empty-recs=yes/no/only
'Yes' to also match records without attached
documents, 'no' to exclude them, 'only' to consider
only such records (e.g. for statistics)
--with-empty-docs=yes/no/only
'Yes' to also match documents without attached files,
'no' to exclude them, 'only' to consider only such
documents (e.g. for sanity checking)
--with-record-modification-date=date1,date2
matches records modified between date1 and date2; dates
can be expressed relatively, e.g.: "-5m,2030-2-23 04:40" #
matches records modified since 5 minutes ago until the
2030...
--with-record-creation-date=date1,date2
matches records created between date1 and date2; dates
can be expressed relatively
--with-document-modification-date=date1,date2
matches documents modified between date1 and date2;
dates can be expressed relatively
--with-document-creation-date=date1,date2
matches documents created between date1 and date2;
dates can be expressed relatively
--url=URL matches the document referred to by the URL, e.g.
"http://pcsk.cern.ch/record/1/files/foobar.pdf?version=2"
--path=PATH matches the document referred to by the internal
filesystem path, e.g.
/opt/cds-dev/var/data/files/g0/1/foobar.pdf\;1
--with-docname=DOCNAME
matches documents with the given docname (accepts
wildcards)
--with-doctype=DOCTYPE
matches documents with the given doctype
-p PATTERN, --pattern=PATTERN
matches records by pattern
-c COLLECTION, --collection=COLLECTION
matches records by collection
--force force an action even when it's not necessary, e.g.
textify on an already textified bibdoc
Actions for getting information:
--get-info print all the information about the matched
records/documents
--get-disk-usage print disk usage statistics of the matched documents
--get-history print the matched documents history
Actions for setting information:
--set-doctype=doctype
specify the new doctype
--set-description=description
specify a description
--set-comment=comment
specify a comment
--set-restriction=restriction
specify a restriction tag
--set-docname=docname
specifies a new docname for renaming
--unset-comment remove any comment
--unset-descriptions
remove any description
--unset-restrictions
remove any restriction
--hide hides matched documents and revisions
--unhide unhides matched documents and revisions
Actions for revising content:
--append=PATH/URL specify the URL/path of the file that will be appended
to the bibdoc
--revise=PATH/URL specify the URL/path of the file that will revise the
bibdoc
--revert reverts a document to the specified version
--delete soft-delete the matched documents (applies to all
revisions and formats)
--hard-delete hard-delete the matched documents (applies to matched
revisions and formats)
--undelete undelete previously soft-deleted documents (applies to
all revisions and formats)
--purge purge (i.e. hard-delete previous versions) the matched
documents
--expunge expunge (i.e. hard-delete any version and formats) the
matched documents
--with-versions=VERSION
specifies the version(s) to be used with hard-delete,
hide, revert, e.g.: 1-2,3 or all
--with-format=FORMAT
to specify a format when
appending/revising/deleting/reverting a document, e.g.
"pdf"
--with-hide-previous
when revising, hides previous versions
Actions for housekeeping:
--check-md5 check md5 checksum validity of files
--check-format check if any format-related inconsistencies exist
--check-duplicate-docnames
check for duplicate docnames associated with the same
record
--update-md5 update md5 checksum of files
--fix-all fix inconsistencies in filesystem vs database vs MARC
--fix-marc synchronize MARC after filesystem/database
--fix-format fix format-related inconsistencies
--fix-duplicate-docnames
fix duplicate docnames associated with the same record
Experimental options (do not expect to find them in the next release):
--set-icon=URL/PATH
attach the specified icon to the matched documents
--unset-icon remove any icon on the matched documents
--textify extract text from matched documents and store it for
later indexing
--with-ocr=yes/no/always
when used with --textify, whether to perform OCR (yes
will perform it only if necessary, based on a
heuristic)
Examples:
$ bibdocfile --append foo.tar.gz --recids=1
$ bibdocfile --revise 'http://foo.com?search=123' --with-docname='sam' \
--with-format=pdf --recids=3 --set-docname='pippo' # revise for record 3
$ bibdocfile --delete --with-docname='*sam' --all # delete all documents whose docname ends with "sam"
$ bibdocfile --undelete -c "Test Collection" # undelete documents for the collection
$ bibdocfile --get-info --recids=1-4,6-8 # obtain information
$ bibdocfile -r 1 --with-docname=foo --set-docname=bar # rename a document
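The range syntax accepted by --recids and --docids (e.g. 1-3,5-7) can be illustrated with a short Python sketch. This is not bibdocfile's own parser: the helper names expand_ranges and build_bibdocfile_command are hypothetical, and the sketch assumes ranges are inclusive at both ends, which the help text does not state explicitly. The second helper only assembles an argument list from options shown above; nothing is executed.

```python
import shlex

BIBDOCFILE = "/opt/invenio/bin/bibdocfile"  # path taken from the usage line


def expand_ranges(spec):
    """Expand a spec such as "1-3,5-7" into a sorted list of integers.

    Assumption: ranges are inclusive at both ends.
    """
    ids = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            low, high = part.split("-", 1)
            ids.update(range(int(low), int(high) + 1))
        else:
            ids.add(int(part))
    return sorted(ids)


def build_bibdocfile_command(action, recids, extra=()):
    """Return an argv list for bibdocfile; the command is not run."""
    return [BIBDOCFILE, action, "--recids=" + recids, *extra]


if __name__ == "__main__":
    print(expand_ranges("1-3,5-7"))
    argv = build_bibdocfile_command("--get-info", "1-4,6-8", ["--human-readable"])
    print(shlex.join(argv))
```

Building the argument list as a Python list, rather than as one string, avoids shell quoting problems when docnames or collection names contain spaces.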
Usage: python /opt/invenio/lib/python/invenio/websubmit_file_converter.py [options]
Options:
-h, --help show this help message and exit
-c FILE, --convert=FILE
convert the specified FILE
-d, --debug Enable debug information
--special-pdf2hocr2pdf=FILE
convert the given scanned PDF into a PDF with OCRed
text
-f FORMAT, --format=FORMAT
the desired output format
-o OUTPUT_NAME, --output=OUTPUT_NAME
the desired output FILE (if not specified a new file
will be generated with the desired output format)
--without-pdfa don't force creation of PDF/A PDFs
--without-pdfopt don't force optimization of PDF files
--without-ocr don't force OCR
--can-convert=FORMAT display all the formats that can be generated from
the given format
--is-ocr-needed=FILE check if OCR is needed for the FILE specified
-t TITLE, --title=TITLE
specify the title (used when creating PDFs)
-l LN, --language=LN specify the language (used when performing OCR, e.g.
en, it, fr...)
Examples:
python /opt/invenio/lib/python/invenio/websubmit_file_converter.py \
--convert=foo.docx -f pdf
python /opt/invenio/lib/python/invenio/websubmit_file_converter.py \
--special-pdf2hocr2pdf=scanned-foo.pdf --output=ocred-foo.pdf
This module is used internally by Invenio to convert files from one format to another.
It can also be invoked manually by a system administrator to obtain the same conversion quality that Invenio offers.
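For an administrator who wants to script manual conversions, the following Python sketch assembles the converter's command line from the options listed above. The function name converter_command and its keyword arguments are hypothetical conveniences, not part of Invenio; the sketch only builds the argument list and leaves running it (for example with subprocess.run) to the caller.

```python
import shlex

CONVERTER = "/opt/invenio/lib/python/invenio/websubmit_file_converter.py"


def converter_command(source, fmt, output=None, title=None, language=None,
                      ocr=True):
    """Build (but do not run) an argv list for one conversion.

    Only options documented in the help text above are emitted.
    """
    argv = ["python", CONVERTER, "--convert=" + source, "--format=" + fmt]
    if output is not None:
        argv.append("--output=" + output)
    if title is not None:
        argv.append("--title=" + title)
    if language is not None:
        argv.append("--language=" + language)
    if not ocr:
        # Corresponds to the documented --without-ocr switch.
        argv.append("--without-ocr")
    return argv


if __name__ == "__main__":
    # Print a shell-safe rendering of the command for inspection.
    print(shlex.join(converter_command("foo.docx", "pdf", title="My report")))
```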
Usage:
python /opt/invenio/lib/python/invenio/websubmit_icon_creator.py \
[options] input-file.jpg
websubmit_icon_creator.py is used to create an icon for an image.
Options:
-h, --help Print this help.
-V, --version Print version information.
-v, --verbose=LEVEL Verbose level (0=min, 1=default, 9=max).
[NOT IMPLEMENTED]
-s, --icon-scale
Scaling information for the icon that is to
be created. Must be an integer. Defaults to
180.
-m, --multipage-icon
A flag to indicate that the icon should
consist of multiple pages. Will only be
respected if the requested icon type is GIF
and the input file is a PS or PDF consisting
of several pages.
-d, --multipage-icon-delay=VAL
If the icon consists of several pages and is
an animated GIF, a delay between frames can
be specified. Must be an integer. Defaults
to 100.
-f, --icon-file-format=FORMAT
The file format of the icon to be created.
Must be one of:
[pdf, gif, jpg, jpeg, ps, png, bmp]
Defaults to gif.
-o, --icon-name=XYZ
The optional name to be given to the created
icon file. If this is omitted, the icon file
will be given the same name as the input
file, but will be prefixed by "icon-".
Examples:
python /opt/invenio/lib/python/invenio/websubmit_icon_creator.py \
--icon-scale=200 \
--icon-name=test-icon \
--icon-file-format=jpg \
test-image.jpg
python /opt/invenio/lib/python/invenio/websubmit_icon_creator.py \
--icon-scale=200 \
--icon-name=test-icon2 \
--icon-file-format=gif \
--multipage-icon \
--multipage-icon-delay=50 \
test-image2.pdf
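The constraints stated in the option list (an integer scale defaulting to 180, a fixed set of icon formats defaulting to gif, and multipage icons being honoured only for GIF output) can be checked before the script is called. The Python sketch below is one way to do that; icon_command is a hypothetical helper, and raising an error for a non-GIF multipage request is a design choice of this sketch, since the help text only says the flag "will only be respected" in the GIF case.

```python
ICON_CREATOR = "/opt/invenio/lib/python/invenio/websubmit_icon_creator.py"
ICON_FORMATS = ("pdf", "gif", "jpg", "jpeg", "ps", "png", "bmp")


def icon_command(input_file, scale=180, fmt="gif", name=None,
                 multipage=False, delay=100):
    """Validate the arguments and return an argv list (nothing is run)."""
    if fmt not in ICON_FORMATS:
        raise ValueError("unsupported icon format: %r" % (fmt,))
    if not isinstance(scale, int) or not isinstance(delay, int):
        raise ValueError("scale and delay must be integers")
    if multipage and fmt != "gif":
        # Sketch-level choice: fail loudly instead of letting the flag be ignored.
        raise ValueError("multipage icons require the gif format")

    argv = ["python", ICON_CREATOR,
            "--icon-scale=%d" % scale,
            "--icon-file-format=" + fmt]
    if name is not None:
        argv.append("--icon-name=" + name)
    if multipage:
        argv.append("--multipage-icon")
        argv.append("--multipage-icon-delay=%d" % delay)
    argv.append(input_file)
    return argv


if __name__ == "__main__":
    print(icon_command("test-image2.pdf", scale=200, name="test-icon2",
                       multipage=True, delay=50))
```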
Usage:
python /opt/invenio/lib/python/invenio/websubmit_file_stamper.py \
[options] input-file.pdf
websubmit_file_stamper.py is used to add a "stamp" to a PDF file.
A LaTeX template is used to create the stamp and this stamp is then
concatenated with the original PDF file.
The stamp can take the form of either a separate "cover page" that is
appended to the document; or a "mark" that is applied somewhere either
on the document's first page or on all of its pages.
Options:
-h, --help Print this help.
-V, --version Print version information.
-v, --verbose=LEVEL Verbose level (0=min, 1=default, 9=max).
[NOT IMPLEMENTED]
-t, --latex-template=PATH
Path to the LaTeX template file that should be used
for the creation of the PDF stamp. (Note, if it's
just a basename, it will be sought first in the
current working directory, and then in the invenio
file-stamper templates directory; If there is a
qualifying path to the template name, it will be
sought only in that location);
-c, --latex-template-var='VARNAME=VALUE'
A variable that should be replaced in the LaTeX
template file with its corresponding value. Of the
following format:
VARNAME=VALUE
This option is repeatable - one for each template
variable;
-s, --stamp=STAMP-TYPE
The type of stamp to be applied to the subject
file. Must be one of 3 values:
+ "first" - stamp only the first page;
+ "all" - stamp all pages;
+ "coverpage" - add a cover page to the
document;
The default value is "first";
-o, --output-file=XYZ
The optional name to be given to the finished
(stamped) file. If this is omitted, the stamped
file will be given the same name as the input
file, but will be prefixed by "stamped-".
Example:
python /opt/invenio/lib/python/invenio/websubmit_file_stamper.py \
--latex-template=demo-stamp-left.tex \
--latex-template-var='REPORTNUMBER=TEST-THESIS-2008-019' \
--latex-template-var='DATE=27/02/2008' \
--stamp='first' \
--output-file=testfile_stamped.pdf \
testfile.pdf
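Because --latex-template-var is repeatable, one occurrence per template variable, a mapping of variable names to values is a natural way to drive the stamper from a script. The Python sketch below builds the same command as the example above; stamper_command is a hypothetical helper, not an Invenio function, and it only returns the argument list without running it.

```python
STAMPER = "/opt/invenio/lib/python/invenio/websubmit_file_stamper.py"
STAMP_TYPES = ("first", "all", "coverpage")


def stamper_command(input_file, template, variables, stamp="first",
                    output=None):
    """Return an argv list for the file stamper (nothing is executed).

    `variables` maps LaTeX template variable names to their values; each
    entry becomes one --latex-template-var=VARNAME=VALUE option.
    """
    if stamp not in STAMP_TYPES:
        raise ValueError("stamp must be one of %s" % ", ".join(STAMP_TYPES))

    argv = ["python", STAMPER, "--latex-template=" + template]
    for name, value in variables.items():
        argv.append("--latex-template-var=%s=%s" % (name, value))
    argv.append("--stamp=" + stamp)
    if output is not None:
        argv.append("--output-file=" + output)
    argv.append(input_file)
    return argv


if __name__ == "__main__":
    for arg in stamper_command(
            "testfile.pdf",
            "demo-stamp-left.tex",
            {"REPORTNUMBER": "TEST-THESIS-2008-019", "DATE": "27/02/2008"},
            output="testfile_stamped.pdf"):
        print(arg)
```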