Crawling data

If we want to share a database of our own, either as an official release or as a product for the community to use, we have to run the --crawl_my_data command. This command updates and reindexes all the databases that are pending under a given path.

By default it crawls the user’s whole projectdata directory, /path_to/projectdata/user-<username>/:

$ freva --crawl_my_data

If we already have a lot of data, it is more convenient (and much faster) to crawl just a sub-directory, for example:

$ freva --crawl_my_data --path=/path_to/projectdata/user-<username>/observations/
Please wait while the system is crawling your data
Sending last 3 entries and 3 entries to latest core

success
0
Finished.
Crawling took 2.5140349865 seconds
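
When only a few sub-directories have changed, the same command can be run once per directory. The following is a minimal sketch, not part of freva itself: crawl_each is a hypothetical helper name, and it assumes that freva exits with a non-zero status when a crawl fails, which this page does not confirm.

```shell
#!/usr/bin/env bash
# Hypothetical helper: crawl each given sub-directory in turn.
# Assumption: freva returns a non-zero exit status when a crawl fails.
crawl_each() {
    local dir
    for dir in "$@"; do
        echo "Crawling $dir"
        # Same invocation as in the example above, one directory at a time.
        freva --crawl_my_data --path="$dir" || {
            echo "crawl failed for $dir" >&2
            return 1   # stop at the first failure
        }
    done
}
```

It would be called with one or more directories as arguments, for example the observations/ directory used above.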

The added datasets are now searchable under the project=user-<username> facet, either with the --databrowser command on the command line or via the website’s databrowser.

Note

Be aware of the correct path!

To crawl your own dataset, you first have to set up your folder paths correctly. Consult your project’s directory structure to find out which layout is expected.
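
One way to guard against a wrong path is to check, before crawling, that the directory exists and lies inside your own projectdata directory. The sketch below is only an illustration under that assumption: is_under_root and crawl_checked are hypothetical helper names, not freva commands, and the check does not validate the project-specific folder structure itself.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds only if directory $1 exists and lies inside root $2.
is_under_root() {
    local target root
    # Resolve both paths to absolute, symlink-free form; fail if either is missing.
    target=$(cd "$1" >/dev/null 2>&1 && pwd -P) || return 1
    root=$(cd "$2" >/dev/null 2>&1 && pwd -P) || return 1
    case "$target/" in
        "$root"/*) return 0 ;;   # target is the root itself or below it
        *) return 1 ;;
    esac
}

# Hypothetical wrapper: run the crawl only when the path check passes.
crawl_checked() {
    local path=$1 root=$2
    if ! is_under_root "$path" "$root"; then
        echo "refusing to crawl: $path is not a directory inside $root" >&2
        return 1
    fi
    freva --crawl_my_data --path="$path"
}
```

Here crawl_checked takes the directory to crawl as its first argument and your projectdata directory (/path_to/projectdata/user-<username>/ in the examples above) as its second.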