Harvest for Norway Species List
Created: 15 Oct 04:23

Stage: completed
Fetched: 15 Oct 04:23
Validated: 15 Oct 04:24
Deltas Created: 15 Oct 04:24
Units Normalized: 15 Oct 05:25
Ancestry Built: 15 Oct 04:34
Nodes Matched: 15 Oct 05:23
Names Parsed: 15 Oct 04:34
New Models Stored: 15 Oct 04:29
Indexed: 15 Oct 05:25
Completed: 15 Oct 05:41
Time to Harvest: about 77 minutes (4629.74 seconds)

Harvesting Log (oldest entry first)

# Logfile created on 2019-10-15 04:23:54 -0400 by logger.rb/56815
[START] [2019-10-15 04:23:54] logged process
[START] [2019-10-15 04:23:54] create_harvest_instance
[STOP] [2019-10-15 04:23:55] create_harvest_instance
[START] [2019-10-15 04:23:55] fetch_files
[STOP] [2019-10-15 04:23:55] fetch_files
[START] [2019-10-15 04:23:55] validate_each_file
[STOP] [2019-10-15 04:24:01] validate_each_file
[START] [2019-10-15 04:24:01] convert_to_csv
[CMD] [2019-10-15 04:24:01] /usr/bin/sort /app/public/converted_csv/norway_sp_list_refs_16821.csv > /app/public/converted_csv/norway_sp_list_refs_16821.csv_sorted
[CMD] [2019-10-15 04:24:01] /usr/bin/sort /app/public/converted_csv/norway_sp_list_nodes_16822.csv > /app/public/converted_csv/norway_sp_list_nodes_16822.csv_sorted
[CMD] [2019-10-15 04:24:02] /usr/bin/sort /app/public/converted_csv/norway_sp_list_occurrences_16823.csv > /app/public/converted_csv/norway_sp_list_occurrences_16823.csv_sorted
[CMD] [2019-10-15 04:24:02] /usr/bin/sort /app/public/converted_csv/norway_sp_list_measurements_16824.csv > /app/public/converted_csv/norway_sp_list_measurements_16824.csv_sorted
[STOP] [2019-10-15 04:24:02] convert_to_csv
[START] [2019-10-15 04:24:02] calculate_delta
[CMD] [2019-10-15 04:24:02] echo "0a" > /app/public/diff/norway_sp_list_refs_16821.diff
[CMD] [2019-10-15 04:24:02] tail -n +1 /app/public/converted_csv/norway_sp_list_refs_16821.csv >> /app/public/diff/norway_sp_list_refs_16821.diff
[CMD] [2019-10-15 04:24:03] echo "." >> /app/public/diff/norway_sp_list_refs_16821.diff
[CMD] [2019-10-15 04:24:03] echo "0a" > /app/public/diff/norway_sp_list_nodes_16822.diff
[CMD] [2019-10-15 04:24:03] tail -n +1 /app/public/converted_csv/norway_sp_list_nodes_16822.csv >> /app/public/diff/norway_sp_list_nodes_16822.diff
[CMD] [2019-10-15 04:24:04] echo "." >> /app/public/diff/norway_sp_list_nodes_16822.diff
[CMD] [2019-10-15 04:24:04] echo "0a" > /app/public/diff/norway_sp_list_occurrences_16823.diff
[CMD] [2019-10-15 04:24:04] tail -n +1 /app/public/converted_csv/norway_sp_list_occurrences_16823.csv >> /app/public/diff/norway_sp_list_occurrences_16823.diff
[CMD] [2019-10-15 04:24:04] echo "." >> /app/public/diff/norway_sp_list_occurrences_16823.diff
[CMD] [2019-10-15 04:24:05] echo "0a" > /app/public/diff/norway_sp_list_measurements_16824.diff
[CMD] [2019-10-15 04:24:05] tail -n +1 /app/public/converted_csv/norway_sp_list_measurements_16824.csv >> /app/public/diff/norway_sp_list_measurements_16824.diff
[CMD] [2019-10-15 04:24:05] echo "." >> /app/public/diff/norway_sp_list_measurements_16824.diff
[STOP] [2019-10-15 04:24:06] calculate_delta
[START] [2019-10-15 04:24:06] parse_diff_and_store
[INFO] [2019-10-15 04:24:06] Loading refs diff file into memory (true lines)...
[INFO] [2019-10-15 04:24:06] Loading nodes diff file into memory (true lines)...
[INFO] [2019-10-15 04:24:26] Loading occurrences diff file into memory (true lines)...
[INFO] [2019-10-15 04:24:31] Loading measurements diff file into memory (true lines)...
[INFO] [2019-10-15 04:28:02] Storing 2 References
[INFO] [2019-10-15 04:28:02] Processing group of 2 in 1 groups of 1000
[INFO] [2019-10-15 04:28:02] Average Time: 0.0
[INFO] [2019-10-15 04:28:02] Total Time: 1s
[INFO] [2019-10-15 04:28:02] Storing 50333 ScientificNames
[INFO] [2019-10-15 04:28:02] Processing group of 50333 in 51 groups of 1000
[INFO] [2019-10-15 04:28:22] Average Time: 0.392
[INFO] [2019-10-15 04:28:22] Total Time: 21s
[INFO] [2019-10-15 04:28:22] last 3 / first 3: 0.78
[INFO] [2019-10-15 04:28:22] Std.Dev: 0.11832159566199232; Max: 0.85
[INFO] [2019-10-15 04:28:22] Storing 50333 Nodes
[INFO] [2019-10-15 04:28:22] Processing group of 50333 in 51 groups of 1000
[INFO] [2019-10-15 04:28:40] Average Time: 0.348
[INFO] [2019-10-15 04:28:40] Total Time: 18s
[INFO] [2019-10-15 04:28:40] last 3 / first 3: 1.16
[INFO] [2019-10-15 04:28:40] Std.Dev: 0.17888543819998318; Max: 1.5
[INFO] [2019-10-15 04:28:40] Storing 35668 Occurrences
[INFO] [2019-10-15 04:28:40] Processing group of 35668 in 36 groups of 1000
[INFO] [2019-10-15 04:28:46] Average Time: 0.158
[INFO] [2019-10-15 04:28:46] Total Time: 6s
[INFO] [2019-10-15 04:28:46] last 3 / first 3: 0.43
[INFO] [2019-10-15 04:28:46] Std.Dev: 0.18973665961010275; Max: 1.24
[INFO] [2019-10-15 04:28:46] Storing 71790 TraitsReferences
[INFO] [2019-10-15 04:28:46] Processing group of 71790 in 72 groups of 1000
[INFO] [2019-10-15 04:28:52] Average Time: 0.071
[INFO] [2019-10-15 04:28:52] Total Time: 6s
[INFO] [2019-10-15 04:28:52] last 3 / first 3: 0.59
[INFO] [2019-10-15 04:28:52] Std.Dev: 0.0; Max: 0.17
[INFO] [2019-10-15 04:28:52] Storing 71789 Traits
[INFO] [2019-10-15 04:28:52] Processing group of 71789 in 72 groups of 1000
[INFO] [2019-10-15 04:29:16] Average Time: 0.338
[INFO] [2019-10-15 04:29:16] Total Time: 25s
[INFO] [2019-10-15 04:29:16] last 3 / first 3: 0.96
[INFO] [2019-10-15 04:29:16] Std.Dev: 0.12649110640673517; Max: 1.12
[INFO] [2019-10-15 04:29:16] Storing 71769 MetaTraits
[INFO] [2019-10-15 04:29:16] Processing group of 71769 in 72 groups of 1000
[INFO] [2019-10-15 04:29:29] Average Time: 0.176
[INFO] [2019-10-15 04:29:29] Total Time: 14s
[INFO] [2019-10-15 04:29:29] last 3 / first 3: 0.55
[INFO] [2019-10-15 04:29:29] Std.Dev: 0.382099463490856; Max: 2.79
[STOP] [2019-10-15 04:29:29] parse_diff_and_store
[START] [2019-10-15 04:29:29] resolve_keys
[INFO] [2019-10-15 04:31:36] Occurrences to nodes (through scientific_names)...
[INFO] [2019-10-15 04:31:45] traits to occurrences...
[INFO] [2019-10-15 04:31:52] traits to nodes (through occurrences)...
[INFO] [2019-10-15 04:31:53] Traits to sex term...
[INFO] [2019-10-15 04:31:59] Traits to lifestage term...
[INFO] [2019-10-15 04:32:05] MetaTraits to traits...
[INFO] [2019-10-15 04:32:10] MetaTraits (simple, measurement row refers to parent) to traits...
[INFO] [2019-10-15 04:32:20] Assocs to occurrences...
[INFO] [2019-10-15 04:32:20] Assocs to nodes...
[INFO] [2019-10-15 04:32:20] Assoc to sex term...
[INFO] [2019-10-15 04:32:20] Assoc to lifestage term...
[STOP] [2019-10-15 04:32:20] resolve_keys
[START] [2019-10-15 04:32:20] hold_for_later_1
[STOP] [2019-10-15 04:32:20] hold_for_later_1
[START] [2019-10-15 04:32:20] hold_for_later_2
[STOP] [2019-10-15 04:32:20] hold_for_later_2
[START] [2019-10-15 04:32:20] resolve_missing_parents
[STOP] [2019-10-15 04:33:29] resolve_missing_parents
[START] [2019-10-15 04:33:29] rebuild_nodes
[START] [2019-10-15 04:33:29] Flattener#flatten
[START] [2019-10-15 04:33:29] Flattener#study_resource
[START] [2019-10-15 04:33:29] Flattener#build_ancestry
[STOP] [2019-10-15 04:33:38] Flattener#build_ancestry
[INFO] [2019-10-15 04:33:38] 50333 ancestry keys
[START] [2019-10-15 04:33:38] build_node_ancestors
[INFO] [2019-10-15 04:33:38] old ancestors deleted.
[STOP] [2019-10-15 04:33:58] build_node_ancestors
[START] [2019-10-15 04:34:04] Flattener#propagate_ancestor_ids
[STOP] [2019-10-15 04:34:09] Flattener#propagate_ancestor_ids
[STOP] [2019-10-15 04:34:09] Flattener#flatten
[STOP] [2019-10-15 04:34:09] rebuild_nodes
[START] [2019-10-15 04:34:09] resolve_missing_media_owners
[STOP] [2019-10-15 04:34:09] resolve_missing_media_owners
[START] [2019-10-15 04:34:09] sanitize_media_verbatims
[STOP] [2019-10-15 04:34:09] sanitize_media_verbatims
[START] [2019-10-15 04:34:09] queue_downloads
[STOP] [2019-10-15 04:34:09] queue_downloads
[START] [2019-10-15 04:34:09] parse_names
[WARN] [2019-10-15 04:34:09] I see 50333 names which still need to be parsed.
[STOP] [2019-10-15 04:34:48] parse_names
[START] [2019-10-15 04:34:48] denormalize_canonical_names_to_nodes
[STOP] [2019-10-15 04:34:49] denormalize_canonical_names_to_nodes
[START] [2019-10-15 04:34:49] match_nodes
[START] [2019-10-15 04:34:49] map_all_nodes_to_pages
[STOP] [2019-10-15 05:23:19] map_all_nodes_to_pages
[INFO] [2019-10-15 05:23:19] 5825 Unmatched nodes (of 50333)! That's too many to output. First 10: Anas penelope (#51344464); Anas clypeata (#51344682); Anas strepera (#51344740); Anas querquedula (#51344826); Anas americana (#51346840); Anas discors (#51353527); Anas formosa (#51369320); Anas cyanoptera (#51373409); Anas sibilatrix (#51373876); Anas falcata (#51384757)
[START] [2019-10-15 05:23:19] update_nodes
[STOP] [2019-10-15 05:23:37] update_nodes
[STOP] [2019-10-15 05:23:37] match_nodes
[START] [2019-10-15 05:23:37] reindex_search
[STOP] [2019-10-15 05:25:43] reindex_search
[START] [2019-10-15 05:25:43] normalize_units
[STOP] [2019-10-15 05:25:43] normalize_units
[START] [2019-10-15 05:25:43] calculate_statistics
[STOP] [2019-10-15 05:25:43] calculate_statistics
[START] [2019-10-15 05:25:43] complete_harvest_instance
[START] [2019-10-15 05:25:43] overall_tsv_creation
[INFO] [2019-10-15 05:25:43] Processing group of 50333 in 6 batches of 10000
[INFO] [2019-10-15 05:27:13] 5975 Traits (unfiltered)...
[INFO] [2019-10-15 05:27:27] 5975 Traits (filtered)...
[INFO] [2019-10-15 05:27:27] 0 Associations (filtered)...
[INFO] [2019-10-15 05:28:16] 29874 metadata added.
[INFO] [2019-10-15 05:28:16] 0 metadata added.
[INFO] [2019-10-15 05:29:49] 6958 Traits (unfiltered)...
[INFO] [2019-10-15 05:30:04] 6958 Traits (filtered)...
[INFO] [2019-10-15 05:30:04] 0 Associations (filtered)...
[INFO] [2019-10-15 05:31:03] 34789 metadata added.
[INFO] [2019-10-15 05:31:03] 0 metadata added.
[INFO] [2019-10-15 05:32:38] 7354 Traits (unfiltered)...
[INFO] [2019-10-15 05:32:52] 7354 Traits (filtered)...
[INFO] [2019-10-15 05:32:52] 0 Associations (filtered)...
[INFO] [2019-10-15 05:33:50] 36764 metadata added.
[INFO] [2019-10-15 05:33:50] 0 metadata added.
[INFO] [2019-10-15 05:35:28] 7575 Traits (unfiltered)...
[INFO] [2019-10-15 05:35:42] 7575 Traits (filtered)...
[INFO] [2019-10-15 05:35:42] 0 Associations (filtered)...
[INFO] [2019-10-15 05:36:37] 37870 metadata added.
[INFO] [2019-10-15 05:36:37] 0 metadata added.
[INFO] [2019-10-15 05:38:11] 7771 Traits (unfiltered)...
[INFO] [2019-10-15 05:38:25] 7771 Traits (filtered)...
[INFO] [2019-10-15 05:38:25] 0 Associations (filtered)...
[INFO] [2019-10-15 05:39:24] 38847 metadata added.
[INFO] [2019-10-15 05:39:24] 0 metadata added.
[INFO] [2019-10-15 05:40:11] 35 Traits (unfiltered)...
[INFO] [2019-10-15 05:40:25] 35 Traits (filtered)...
[INFO] [2019-10-15 05:40:25] 0 Associations (filtered)...
[INFO] [2019-10-15 05:41:02] 175 metadata added.
[INFO] [2019-10-15 05:41:02] 0 metadata added.
[INFO] [2019-10-15 05:41:02] Average Time: 126.945
[INFO] [2019-10-15 05:41:02] Total Time: 15m20s
[STOP] [2019-10-15 05:41:02] overall_tsv_creation
[INFO] [2019-10-15 05:41:02] Done. Check your files:
[INFO] [2019-10-15 05:41:03] (50333 lines) /app/public/data/norway_sp_list/publish_nodes.tsv
[INFO] [2019-10-15 05:41:03] (179105 lines) /app/public/data/norway_sp_list/publish_node_ancestors.tsv
[INFO] [2019-10-15 05:41:03] (50333 lines) /app/public/data/norway_sp_list/publish_scientific_names.tsv
[INFO] [2019-10-15 05:41:04] (35669 lines) /app/public/data/norway_sp_list/publish_traits.tsv
[INFO] [2019-10-15 05:41:04] (178320 lines) /app/public/data/norway_sp_list/publish_metadata.tsv
[STOP] [2019-10-15 05:41:04] complete_harvest_instance
[START] [2019-10-15 05:41:04] completed
[STOP] [2019-10-15 05:41:04] completed
[STOP] [2019-10-15 05:41:04] logged process, took 4629.74
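
The START/STOP pairs in this log can be turned into per-stage durations to see where the harvest spent its time. The following is a minimal Python sketch, assuming the line format shown above ("[LEVEL] [YYYY-MM-DD HH:MM:SS] message"). The function name stage_durations and the sample list are illustrative and not part of the harvester.

```python
import re
from datetime import datetime

# Matches lines such as "[START] [2019-10-15 04:24:02] calculate_delta".
LINE_RE = re.compile(
    r"^\[(START|STOP)\] \[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\] (.+)$"
)


def stage_durations(lines):
    """Return {stage name: seconds} for every START that has a matching STOP."""
    started = {}
    durations = {}
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # skip [INFO], [CMD], [WARN] and header lines
        kind, stamp, name = match.groups()
        # The final STOP line carries a ", took <seconds>" suffix; drop it
        # so the name matches its START line.
        name = name.split(", took")[0]
        when = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S")
        if kind == "START":
            started[name] = when
        elif name in started:
            durations[name] = (when - started.pop(name)).total_seconds()
    return durations


# Sample lines copied from the log above.
sample = [
    "[START] [2019-10-15 04:34:49] match_nodes",
    "[START] [2019-10-15 04:34:49] map_all_nodes_to_pages",
    "[STOP] [2019-10-15 05:23:19] map_all_nodes_to_pages",
    "[STOP] [2019-10-15 05:23:37] match_nodes",
]
for name, seconds in stage_durations(sample).items():
    print(f"{name}: {seconds:.0f}s")
```

For the sample lines, map_all_nodes_to_pages comes out at 2910 seconds, which is well over half of the 4629.74 seconds reported for the whole process. Timestamps have one-second resolution, so very short stages show as 0 seconds.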
