Harvest for Austria Species List Created 01 Oct 23:51

Stage: completed
Fetched: 01 Oct 23:51
Validated: 01 Oct 23:51
Deltas Created: 01 Oct 23:51
Units Normalized: 02 Oct 00:39
Ancestry Built: 02 Oct 00:01
Nodes Matched: 02 Oct 00:37
Names Parsed: 02 Oct 00:01
New Models Stored: 01 Oct 23:56
Indexed: 02 Oct 00:39
Completed: 02 Oct 00:52
Time to Harvest: 1 hour 1 minute

Harvesting Log (most recent first)

# Logfile created on 2019-10-01 23:51:10 -0400 by logger.rb/56815
[START] [2019-10-01 23:51:10] logged process
[START] [2019-10-01 23:51:10] create_harvest_instance
[STOP] [2019-10-01 23:51:11] create_harvest_instance
[START] [2019-10-01 23:51:11] fetch_files
[STOP] [2019-10-01 23:51:11] fetch_files
[START] [2019-10-01 23:51:11] validate_each_file
[STOP] [2019-10-01 23:51:19] validate_each_file
[START] [2019-10-01 23:51:19] convert_to_csv
[CMD] [2019-10-01 23:51:19] /usr/bin/sort /app/public/converted_csv/austria_sp_list_refs_14850.csv > /app/public/converted_csv/austria_sp_list_refs_14850.csv_sorted
[CMD] [2019-10-01 23:51:21] /usr/bin/sort /app/public/converted_csv/austria_sp_list_nodes_14851.csv > /app/public/converted_csv/austria_sp_list_nodes_14851.csv_sorted
[CMD] [2019-10-01 23:51:22] /usr/bin/sort /app/public/converted_csv/austria_sp_list_occurrences_14852.csv > /app/public/converted_csv/austria_sp_list_occurrences_14852.csv_sorted
[CMD] [2019-10-01 23:51:24] /usr/bin/sort /app/public/converted_csv/austria_sp_list_measurements_14853.csv > /app/public/converted_csv/austria_sp_list_measurements_14853.csv_sorted
[STOP] [2019-10-01 23:51:26] convert_to_csv
[START] [2019-10-01 23:51:26] calculate_delta
[CMD] [2019-10-01 23:51:26] echo "0a" > /app/public/diff/austria_sp_list_refs_14850.diff
[CMD] [2019-10-01 23:51:27] tail -n +1 /app/public/converted_csv/austria_sp_list_refs_14850.csv >> /app/public/diff/austria_sp_list_refs_14850.diff
[CMD] [2019-10-01 23:51:29] echo "." >> /app/public/diff/austria_sp_list_refs_14850.diff
[CMD] [2019-10-01 23:51:30] echo "0a" > /app/public/diff/austria_sp_list_nodes_14851.diff
[CMD] [2019-10-01 23:51:32] tail -n +1 /app/public/converted_csv/austria_sp_list_nodes_14851.csv >> /app/public/diff/austria_sp_list_nodes_14851.diff
[CMD] [2019-10-01 23:51:33] echo "." >> /app/public/diff/austria_sp_list_nodes_14851.diff
[CMD] [2019-10-01 23:51:35] echo "0a" > /app/public/diff/austria_sp_list_occurrences_14852.diff
[CMD] [2019-10-01 23:51:36] tail -n +1 /app/public/converted_csv/austria_sp_list_occurrences_14852.csv >> /app/public/diff/austria_sp_list_occurrences_14852.diff
[CMD] [2019-10-01 23:51:38] echo "." >> /app/public/diff/austria_sp_list_occurrences_14852.diff
[CMD] [2019-10-01 23:51:39] echo "0a" > /app/public/diff/austria_sp_list_measurements_14853.diff
[CMD] [2019-10-01 23:51:41] tail -n +1 /app/public/converted_csv/austria_sp_list_measurements_14853.csv >> /app/public/diff/austria_sp_list_measurements_14853.diff
[CMD] [2019-10-01 23:51:42] echo "." >> /app/public/diff/austria_sp_list_measurements_14853.diff
[STOP] [2019-10-01 23:51:44] calculate_delta
[START] [2019-10-01 23:51:44] parse_diff_and_store
[INFO] [2019-10-01 23:51:45] Loading refs diff file into memory (true lines)...
[INFO] [2019-10-01 23:51:47] Loading nodes diff file into memory (true lines)...
[INFO] [2019-10-01 23:52:04] Loading occurrences diff file into memory (true lines)...
[INFO] [2019-10-01 23:52:10] Loading measurements diff file into memory (true lines)...
[INFO] [2019-10-01 23:55:21] Storing 2 References
[INFO] [2019-10-01 23:55:21] Processing group of 2 in 1 groups of 1000
[INFO] [2019-10-01 23:55:21] Average Time: 0.0
[INFO] [2019-10-01 23:55:21] Total Time: 1s
[INFO] [2019-10-01 23:55:21] Storing 46587 ScientificNames
[INFO] [2019-10-01 23:55:21] Processing group of 46587 in 47 groups of 1000
[INFO] [2019-10-01 23:55:38] Average Time: 0.37
[INFO] [2019-10-01 23:55:38] Total Time: 18s
[INFO] [2019-10-01 23:55:38] last 3 / first 3: 0.79
[INFO] [2019-10-01 23:55:38] Std.Dev: 0.11401754250991379; Max: 0.81
[INFO] [2019-10-01 23:55:38] Storing 46587 Nodes
[INFO] [2019-10-01 23:55:38] Processing group of 46587 in 47 groups of 1000
[INFO] [2019-10-01 23:55:55] Average Time: 0.342
[INFO] [2019-10-01 23:55:55] Total Time: 17s
[INFO] [2019-10-01 23:55:55] last 3 / first 3: 0.94
[INFO] [2019-10-01 23:55:55] Std.Dev: 0.14491376746189438; Max: 1.2
[INFO] [2019-10-01 23:55:55] Storing 34222 Occurrences
[INFO] [2019-10-01 23:55:55] Processing group of 34222 in 35 groups of 1000
[INFO] [2019-10-01 23:55:59] Average Time: 0.112
[INFO] [2019-10-01 23:55:59] Total Time: 5s
[INFO] [2019-10-01 23:55:59] last 3 / first 3: 0.87
[INFO] [2019-10-01 23:55:59] Std.Dev: 0.03162277660168379; Max: 0.22
[INFO] [2019-10-01 23:55:59] Storing 68710 TraitsReferences
[INFO] [2019-10-01 23:55:59] Processing group of 68710 in 69 groups of 1000
[INFO] [2019-10-01 23:56:06] Average Time: 0.093
[INFO] [2019-10-01 23:56:06] Total Time: 7s
[INFO] [2019-10-01 23:56:06] last 3 / first 3: 0.67
[INFO] [2019-10-01 23:56:06] Std.Dev: 0.08944271909999159; Max: 0.77
[INFO] [2019-10-01 23:56:06] Storing 68709 Traits
[INFO] [2019-10-01 23:56:06] Processing group of 68709 in 69 groups of 1000
[INFO] [2019-10-01 23:56:34] Average Time: 0.41
[INFO] [2019-10-01 23:56:34] Total Time: 29s
[INFO] [2019-10-01 23:56:34] last 3 / first 3: 0.77
[INFO] [2019-10-01 23:56:34] Std.Dev: 0.372827037646145; Max: 2.36
[INFO] [2019-10-01 23:56:34] Storing 68660 MetaTraits
[INFO] [2019-10-01 23:56:34] Processing group of 68660 in 69 groups of 1000
[INFO] [2019-10-01 23:56:51] Average Time: 0.236
[INFO] [2019-10-01 23:56:51] Total Time: 17s
[INFO] [2019-10-01 23:56:51] last 3 / first 3: 0.71
[INFO] [2019-10-01 23:56:51] Std.Dev: 0.4449719092257398; Max: 2.32
[STOP] [2019-10-01 23:56:51] parse_diff_and_store
[START] [2019-10-01 23:56:51] resolve_keys
[INFO] [2019-10-01 23:58:46] Occurrences to nodes (through scientific_names)...
[INFO] [2019-10-01 23:58:56] traits to occurrences...
[INFO] [2019-10-01 23:59:04] traits to nodes (through occurrences)...
[INFO] [2019-10-01 23:59:05] Traits to sex term...
[INFO] [2019-10-01 23:59:11] Traits to lifestage term...
[INFO] [2019-10-01 23:59:19] MetaTraits to traits...
[INFO] [2019-10-01 23:59:23] MetaTraits (simple, measurement row refers to parent) to traits...
[INFO] [2019-10-01 23:59:33] Assocs to occurrences...
[INFO] [2019-10-01 23:59:33] Assocs to nodes...
[INFO] [2019-10-01 23:59:33] Assoc to sex term...
[INFO] [2019-10-01 23:59:33] Assoc to lifestage term...
[STOP] [2019-10-01 23:59:33] resolve_keys
[START] [2019-10-01 23:59:33] hold_for_later_1
[STOP] [2019-10-01 23:59:33] hold_for_later_1
[START] [2019-10-01 23:59:33] hold_for_later_2
[STOP] [2019-10-01 23:59:33] hold_for_later_2
[START] [2019-10-01 23:59:33] resolve_missing_parents
[STOP] [2019-10-02 00:00:39] resolve_missing_parents
[START] [2019-10-02 00:00:39] rebuild_nodes
[START] [2019-10-02 00:00:39] Flattener#flatten
[START] [2019-10-02 00:00:39] Flattener#study_resource
[START] [2019-10-02 00:00:39] Flattener#build_ancestry
[STOP] [2019-10-02 00:00:46] Flattener#build_ancestry
[INFO] [2019-10-02 00:00:46] 46587 ancestry keys
[START] [2019-10-02 00:00:46] build_node_ancestors
[INFO] [2019-10-02 00:00:46] old ancestors deleted.
[STOP] [2019-10-02 00:01:04] build_node_ancestors
[START] [2019-10-02 00:01:09] Flattener#propagate_ancestor_ids
[STOP] [2019-10-02 00:01:13] Flattener#propagate_ancestor_ids
[STOP] [2019-10-02 00:01:13] Flattener#flatten
[STOP] [2019-10-02 00:01:13] rebuild_nodes
[START] [2019-10-02 00:01:13] resolve_missing_media_owners
[STOP] [2019-10-02 00:01:13] resolve_missing_media_owners
[START] [2019-10-02 00:01:13] sanitize_media_verbatims
[STOP] [2019-10-02 00:01:13] sanitize_media_verbatims
[START] [2019-10-02 00:01:13] queue_downloads
[STOP] [2019-10-02 00:01:13] queue_downloads
[START] [2019-10-02 00:01:13] parse_names
[WARN] [2019-10-02 00:01:13] I see 46587 names which still need to be parsed.
[STOP] [2019-10-02 00:01:47] parse_names
[START] [2019-10-02 00:01:47] denormalize_canonical_names_to_nodes
[STOP] [2019-10-02 00:01:47] denormalize_canonical_names_to_nodes
[START] [2019-10-02 00:01:47] match_nodes
[START] [2019-10-02 00:01:47] map_all_nodes_to_pages
[STOP] [2019-10-02 00:37:33] map_all_nodes_to_pages
[INFO] [2019-10-02 00:37:33] 6702 Unmatched nodes (of 46587)! That's too many to output. First 10: Ardea albus (#47640495); Carduelis chloris (#47617319); Carduelis spinus (#47617805); Carduelis cannabina (#47618831); Carduelis flammea (#47619189); Carduelis flavirostris (#47652207); Parus caeruleus (#47617378); Parus ater (#47617471); Parus palustris (#47617623); Parus cristatus (#47617972)
[START] [2019-10-02 00:37:33] update_nodes
[STOP] [2019-10-02 00:37:49] update_nodes
[STOP] [2019-10-02 00:37:49] match_nodes
[START] [2019-10-02 00:37:49] reindex_search
[STOP] [2019-10-02 00:39:33] reindex_search
[START] [2019-10-02 00:39:33] normalize_units
[STOP] [2019-10-02 00:39:33] normalize_units
[START] [2019-10-02 00:39:33] calculate_statistics
[STOP] [2019-10-02 00:39:33] calculate_statistics
[START] [2019-10-02 00:39:33] complete_harvest_instance
[START] [2019-10-02 00:39:33] overall_tsv_creation
[INFO] [2019-10-02 00:39:33] Processing group of 46587 in 5 batches of 10000
[INFO] [2019-10-02 00:40:58] 6318 Traits (unfiltered)...
[INFO] [2019-10-02 00:41:11] 6318 Traits (filtered)...
[INFO] [2019-10-02 00:41:11] 0 Associations (filtered)...
[INFO] [2019-10-02 00:41:59] 31581 metadata added.
[INFO] [2019-10-02 00:41:59] 0 metadata added.
[INFO] [2019-10-02 00:43:28] 7359 Traits (unfiltered)...
[INFO] [2019-10-02 00:43:41] 7359 Traits (filtered)...
[INFO] [2019-10-02 00:43:41] 0 Associations (filtered)...
[INFO] [2019-10-02 00:44:32] 36791 metadata added.
[INFO] [2019-10-02 00:44:32] 0 metadata added.
[INFO] [2019-10-02 00:46:06] 7834 Traits (unfiltered)...
[INFO] [2019-10-02 00:46:19] 7834 Traits (filtered)...
[INFO] [2019-10-02 00:46:19] 0 Associations (filtered)...
[INFO] [2019-10-02 00:47:13] 39164 metadata added.
[INFO] [2019-10-02 00:47:13] 0 metadata added.
[INFO] [2019-10-02 00:48:46] 7774 Traits (unfiltered)...
[INFO] [2019-10-02 00:48:59] 7774 Traits (filtered)...
[INFO] [2019-10-02 00:48:59] 0 Associations (filtered)...
[INFO] [2019-10-02 00:49:52] 38852 metadata added.
[INFO] [2019-10-02 00:49:52] 0 metadata added.
[INFO] [2019-10-02 00:51:08] 4937 Traits (unfiltered)...
[INFO] [2019-10-02 00:51:20] 4937 Traits (filtered)...
[INFO] [2019-10-02 00:51:20] 0 Associations (filtered)...
[INFO] [2019-10-02 00:52:08] 24674 metadata added.
[INFO] [2019-10-02 00:52:08] 0 metadata added.
[INFO] [2019-10-02 00:52:08] Average Time: 124.148
[INFO] [2019-10-02 00:52:08] Total Time: 12m35s
[STOP] [2019-10-02 00:52:08] overall_tsv_creation
[INFO] [2019-10-02 00:52:08] Done. Check your files:
[INFO] [2019-10-02 00:52:09] (46587 lines) /app/public/data/austria_sp_list/publish_nodes.tsv
[INFO] [2019-10-02 00:52:11] (168570 lines) /app/public/data/austria_sp_list/publish_node_ancestors.tsv
[INFO] [2019-10-02 00:52:12] (46587 lines) /app/public/data/austria_sp_list/publish_scientific_names.tsv
[INFO] [2019-10-02 00:52:14] (34223 lines) /app/public/data/austria_sp_list/publish_traits.tsv
[INFO] [2019-10-02 00:52:15] (171063 lines) /app/public/data/austria_sp_list/publish_metadata.tsv
[STOP] [2019-10-02 00:52:16] complete_harvest_instance
[START] [2019-10-02 00:52:16] completed
[STOP] [2019-10-02 00:52:16] completed
[STOP] [2019-10-02 00:52:16] logged process, took 3665.19
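
To see where the harvest time went, the [START] and [STOP] lines in the log above can be paired by stage name. The following is a minimal sketch in Python, not part of the harvester itself. The function name stage_durations and the regular expression are assumptions made for illustration, based only on the line layout visible in this log.

```python
import re
from datetime import datetime

# Assumed line layout, inferred from the log above:
#   [START] [YYYY-MM-DD HH:MM:SS] stage_name
#   [STOP] [YYYY-MM-DD HH:MM:SS] stage_name[, took N.NN]
LINE_RE = re.compile(
    r"^\[(START|STOP)\] \[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\] (.+?)(?:, took [\d.]+)?$"
)


def stage_durations(log_text):
    """Return {stage_name: seconds} by pairing each [START] with its [STOP]."""
    started = {}
    durations = {}
    for line in log_text.splitlines():
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # skip [INFO], [CMD], [WARN] and header lines
        kind, stamp, name = match.groups()
        when = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S")
        if kind == "START":
            started[name] = when
        elif name in started:
            durations[name] = (when - started.pop(name)).total_seconds()
    return durations


# Four lines copied from the log above.
sample = """\
[START] [2019-10-01 23:51:10] logged process
[START] [2019-10-02 00:01:47] map_all_nodes_to_pages
[STOP] [2019-10-02 00:37:33] map_all_nodes_to_pages
[STOP] [2019-10-02 00:52:16] logged process, took 3665.19
"""

result = stage_durations(sample)
# List the stages, slowest first.
for name, seconds in sorted(result.items(), key=lambda item: -item[1]):
    print(f"{name}: {seconds:.0f}s")
```

Applied to the full log, this pairing shows that map_all_nodes_to_pages (00:01:47 to 00:37:33, about 36 minutes) and overall_tsv_creation (12m35s by the log's own report) account for most of the roughly 61-minute run.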