Stage:
completed
Fetched:
14 Oct 09:28
Validated:
14 Oct 09:28
Deltas Created:
14 Oct 09:28
Units Normalized:
14 Oct 09:42
Ancestry Built:
14 Oct 09:30
Nodes Matched:
14 Oct 09:41
Names Parsed:
14 Oct 09:30
New Models Stored:
14 Oct 09:29
Indexed:
14 Oct 09:42
Completed:
14 Oct 09:46
Time to Harvest:
less than a minute
Harvesting Log
(156 lines)
# Logfile created on 2019-10-14 09:28:04 -0400 by logger.rb/56815
[START] [2019-10-14 09:28:04] logged process
[START] [2019-10-14 09:28:04] create_harvest_instance
[STOP] [2019-10-14 09:28:04] create_harvest_instance
[START] [2019-10-14 09:28:04] fetch_files
[STOP] [2019-10-14 09:28:04] fetch_files
[START] [2019-10-14 09:28:04] validate_each_file
[STOP] [2019-10-14 09:28:06] validate_each_file
[START] [2019-10-14 09:28:06] convert_to_csv
[CMD] [2019-10-14 09:28:06] /usr/bin/sort /app/public/converted_csv/morocco_sp_list_refs_16635.csv > /app/public/converted_csv/morocco_sp_list_refs_16635.csv_sorted
[CMD] [2019-10-14 09:28:06] /usr/bin/sort /app/public/converted_csv/morocco_sp_list_nodes_16636.csv > /app/public/converted_csv/morocco_sp_list_nodes_16636.csv_sorted
[CMD] [2019-10-14 09:28:06] /usr/bin/sort /app/public/converted_csv/morocco_sp_list_occurrences_16637.csv > /app/public/converted_csv/morocco_sp_list_occurrences_16637.csv_sorted
[CMD] [2019-10-14 09:28:06] /usr/bin/sort /app/public/converted_csv/morocco_sp_list_measurements_16638.csv > /app/public/converted_csv/morocco_sp_list_measurements_16638.csv_sorted
[STOP] [2019-10-14 09:28:06] convert_to_csv
[START] [2019-10-14 09:28:06] calculate_delta
[CMD] [2019-10-14 09:28:06] echo "0a" > /app/public/diff/morocco_sp_list_refs_16635.diff
[CMD] [2019-10-14 09:28:06] tail -n +1 /app/public/converted_csv/morocco_sp_list_refs_16635.csv >> /app/public/diff/morocco_sp_list_refs_16635.diff
[CMD] [2019-10-14 09:28:06] echo "." >> /app/public/diff/morocco_sp_list_refs_16635.diff
[CMD] [2019-10-14 09:28:06] echo "0a" > /app/public/diff/morocco_sp_list_nodes_16636.diff
[CMD] [2019-10-14 09:28:06] tail -n +1 /app/public/converted_csv/morocco_sp_list_nodes_16636.csv >> /app/public/diff/morocco_sp_list_nodes_16636.diff
[CMD] [2019-10-14 09:28:06] echo "." >> /app/public/diff/morocco_sp_list_nodes_16636.diff
[CMD] [2019-10-14 09:28:07] echo "0a" > /app/public/diff/morocco_sp_list_occurrences_16637.diff
[CMD] [2019-10-14 09:28:07] tail -n +1 /app/public/converted_csv/morocco_sp_list_occurrences_16637.csv >> /app/public/diff/morocco_sp_list_occurrences_16637.diff
[CMD] [2019-10-14 09:28:07] echo "." >> /app/public/diff/morocco_sp_list_occurrences_16637.diff
[CMD] [2019-10-14 09:28:07] echo "0a" > /app/public/diff/morocco_sp_list_measurements_16638.diff
[CMD] [2019-10-14 09:28:07] tail -n +1 /app/public/converted_csv/morocco_sp_list_measurements_16638.csv >> /app/public/diff/morocco_sp_list_measurements_16638.diff
[CMD] [2019-10-14 09:28:07] echo "." >> /app/public/diff/morocco_sp_list_measurements_16638.diff
[STOP] [2019-10-14 09:28:07] calculate_delta
[START] [2019-10-14 09:28:07] parse_diff_and_store
[INFO] [2019-10-14 09:28:07] Loading refs diff file into memory (true lines)...
[INFO] [2019-10-14 09:28:07] Loading nodes diff file into memory (true lines)...
[INFO] [2019-10-14 09:28:12] Loading occurrences diff file into memory (true lines)...
[INFO] [2019-10-14 09:28:13] Loading measurements diff file into memory (true lines)...
[INFO] [2019-10-14 09:28:57] Storing 2 References
[INFO] [2019-10-14 09:28:57] Processing group of 2 in 1 groups of 1000
[INFO] [2019-10-14 09:28:57] Average Time: 0.0
[INFO] [2019-10-14 09:28:57] Total Time: 1s
[INFO] [2019-10-14 09:28:57] Storing 11699 ScientificNames
[INFO] [2019-10-14 09:28:57] Processing group of 11699 in 12 groups of 1000
[INFO] [2019-10-14 09:29:02] Average Time: 0.378
[INFO] [2019-10-14 09:29:02] Total Time: 5s
[INFO] [2019-10-14 09:29:02] last 3 / first 3: 0.62
[INFO] [2019-10-14 09:29:02] Std.Dev: 0.1; Max: 0.65
[INFO] [2019-10-14 09:29:02] Storing 11699 Nodes
[INFO] [2019-10-14 09:29:02] Processing group of 11699 in 12 groups of 1000
[INFO] [2019-10-14 09:29:05] Average Time: 0.284
[INFO] [2019-10-14 09:29:05] Total Time: 4s
[INFO] [2019-10-14 09:29:05] last 3 / first 3: 0.91
[INFO] [2019-10-14 09:29:05] Std.Dev: 0.03162277660168379; Max: 0.33
[INFO] [2019-10-14 09:29:05] Storing 7230 Occurrences
[INFO] [2019-10-14 09:29:05] Processing group of 7230 in 8 groups of 1000
[INFO] [2019-10-14 09:29:06] Average Time: 0.093
[INFO] [2019-10-14 09:29:06] Total Time: 1s
[INFO] [2019-10-14 09:29:06] last 3 / first 3: 0.73
[INFO] [2019-10-14 09:29:06] Std.Dev: 0.03162277660168379; Max: 0.12
[INFO] [2019-10-14 09:29:06] Storing 15380 TraitsReferences
[INFO] [2019-10-14 09:29:06] Processing group of 15380 in 16 groups of 1000
[INFO] [2019-10-14 09:29:07] Average Time: 0.073
[INFO] [2019-10-14 09:29:07] Total Time: 2s
[INFO] [2019-10-14 09:29:07] last 3 / first 3: 0.57
[INFO] [2019-10-14 09:29:07] Std.Dev: 0.03162277660168379; Max: 0.15
[INFO] [2019-10-14 09:29:07] Storing 15379 Traits
[INFO] [2019-10-14 09:29:07] Processing group of 15379 in 16 groups of 1000
[INFO] [2019-10-14 09:29:12] Average Time: 0.321
[INFO] [2019-10-14 09:29:12] Total Time: 6s
[INFO] [2019-10-14 09:29:12] last 3 / first 3: 0.7
[INFO] [2019-10-14 09:29:12] Std.Dev: 0.10954451150103323; Max: 0.61
[INFO] [2019-10-14 09:29:12] Storing 15362 MetaTraits
[INFO] [2019-10-14 09:29:12] Processing group of 15362 in 16 groups of 1000
[INFO] [2019-10-14 09:29:14] Average Time: 0.103
[INFO] [2019-10-14 09:29:14] Total Time: 2s
[INFO] [2019-10-14 09:29:14] last 3 / first 3: 0.79
[INFO] [2019-10-14 09:29:14] Std.Dev: 0.0; Max: 0.13
[STOP] [2019-10-14 09:29:14] parse_diff_and_store
[START] [2019-10-14 09:29:14] resolve_keys
[INFO] [2019-10-14 09:30:00] Occurrences to nodes (through scientific_names)...
[INFO] [2019-10-14 09:30:04] traits to occurrences...
[INFO] [2019-10-14 09:30:09] traits to nodes (through occurrences)...
[INFO] [2019-10-14 09:30:10] Traits to sex term...
[INFO] [2019-10-14 09:30:14] Traits to lifestage term...
[INFO] [2019-10-14 09:30:18] MetaTraits to traits...
[INFO] [2019-10-14 09:30:19] MetaTraits (simple, measurement row refers to parent) to traits...
[INFO] [2019-10-14 09:30:21] Assocs to occurrences...
[INFO] [2019-10-14 09:30:21] Assocs to nodes...
[INFO] [2019-10-14 09:30:21] Assoc to sex term...
[INFO] [2019-10-14 09:30:21] Assoc to lifestage term...
[STOP] [2019-10-14 09:30:21] resolve_keys
[START] [2019-10-14 09:30:21] hold_for_later_1
[STOP] [2019-10-14 09:30:21] hold_for_later_1
[START] [2019-10-14 09:30:21] hold_for_later_2
[STOP] [2019-10-14 09:30:21] hold_for_later_2
[START] [2019-10-14 09:30:21] resolve_missing_parents
[STOP] [2019-10-14 09:30:43] resolve_missing_parents
[START] [2019-10-14 09:30:43] rebuild_nodes
[START] [2019-10-14 09:30:43] Flattener#flatten
[START] [2019-10-14 09:30:43] Flattener#study_resource
[START] [2019-10-14 09:30:43] Flattener#build_ancestry
[STOP] [2019-10-14 09:30:44] Flattener#build_ancestry
[INFO] [2019-10-14 09:30:44] 11699 ancestry keys
[START] [2019-10-14 09:30:44] build_node_ancestors
[INFO] [2019-10-14 09:30:44] old ancestors deleted.
[STOP] [2019-10-14 09:30:46] build_node_ancestors
[START] [2019-10-14 09:30:48] Flattener#propagate_ancestor_ids
[STOP] [2019-10-14 09:30:48] Flattener#propagate_ancestor_ids
[STOP] [2019-10-14 09:30:48] Flattener#flatten
[STOP] [2019-10-14 09:30:48] rebuild_nodes
[START] [2019-10-14 09:30:48] resolve_missing_media_owners
[STOP] [2019-10-14 09:30:48] resolve_missing_media_owners
[START] [2019-10-14 09:30:48] sanitize_media_verbatims
[STOP] [2019-10-14 09:30:48] sanitize_media_verbatims
[START] [2019-10-14 09:30:48] queue_downloads
[STOP] [2019-10-14 09:30:48] queue_downloads
[START] [2019-10-14 09:30:48] parse_names
[WARN] [2019-10-14 09:30:48] I see 11699 names which still need to be parsed.
[STOP] [2019-10-14 09:30:58] parse_names
[START] [2019-10-14 09:30:58] denormalize_canonical_names_to_nodes
[STOP] [2019-10-14 09:30:58] denormalize_canonical_names_to_nodes
[START] [2019-10-14 09:30:58] match_nodes
[START] [2019-10-14 09:30:58] map_all_nodes_to_pages
[STOP] [2019-10-14 09:41:43] map_all_nodes_to_pages
[INFO] [2019-10-14 09:41:43] 1205 Unmatched nodes (of 11699)! That's too many to output. First 10: Larus audouinii (#50791784); Larus melanocephalus (#50802595); Thalaseus (#50792077); Thalaseus sandvicensis (#50792076); Thalaseus bengalensis (#50794448); Thalaseus maximus (#50798096); Philomachus pugnax (#50792650); Limnodromus (#50797138); Erythropygia (#50792033); Erythropygia galactotes (#50792032)
[START] [2019-10-14 09:41:43] update_nodes
[STOP] [2019-10-14 09:41:47] update_nodes
[STOP] [2019-10-14 09:41:47] match_nodes
[START] [2019-10-14 09:41:47] reindex_search
[STOP] [2019-10-14 09:42:16] reindex_search
[START] [2019-10-14 09:42:16] normalize_units
[STOP] [2019-10-14 09:42:16] normalize_units
[START] [2019-10-14 09:42:16] calculate_statistics
[STOP] [2019-10-14 09:42:16] calculate_statistics
[START] [2019-10-14 09:42:17] complete_harvest_instance
[START] [2019-10-14 09:42:17] overall_tsv_creation
[INFO] [2019-10-14 09:42:17] Processing group of 11699 in 2 batches of 10000
[INFO] [2019-10-14 09:43:44] 6453 Traits (unfiltered)...
[INFO] [2019-10-14 09:43:57] 6453 Traits (filtered)...
[INFO] [2019-10-14 09:43:57] 0 Associations (filtered)...
[INFO] [2019-10-14 09:44:46] 32253 metadata added.
[INFO] [2019-10-14 09:44:46] 0 metadata added.
[INFO] [2019-10-14 09:45:39] 777 Traits (unfiltered)...
[INFO] [2019-10-14 09:45:53] 777 Traits (filtered)...
[INFO] [2019-10-14 09:45:53] 0 Associations (filtered)...
[INFO] [2019-10-14 09:46:31] 3883 metadata added.
[INFO] [2019-10-14 09:46:31] 0 metadata added.
[INFO] [2019-10-14 09:46:31] Average Time: 103.58
[INFO] [2019-10-14 09:46:31] Total Time: 4m15s
[STOP] [2019-10-14 09:46:31] overall_tsv_creation
[INFO] [2019-10-14 09:46:31] Done. Check your files:
[INFO] [2019-10-14 09:46:32] (11699 lines) /app/public/data/morocco_sp_list/publish_nodes.tsv
[INFO] [2019-10-14 09:46:32] (28601 lines) /app/public/data/morocco_sp_list/publish_node_ancestors.tsv
[INFO] [2019-10-14 09:46:32] (11699 lines) /app/public/data/morocco_sp_list/publish_scientific_names.tsv
[INFO] [2019-10-14 09:46:32] (7231 lines) /app/public/data/morocco_sp_list/publish_traits.tsv
[INFO] [2019-10-14 09:46:32] (36137 lines) /app/public/data/morocco_sp_list/publish_metadata.tsv
[STOP] [2019-10-14 09:46:32] complete_harvest_instance
[START] [2019-10-14 09:46:32] completed
[STOP] [2019-10-14 09:46:32] completed
[STOP] [2019-10-14 09:46:32] logged process, took 1108.53
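The [START]/[STOP] pairs in the log above can be turned into per-stage durations, which makes it easier to see where the harvest spent its time (for example, map_all_nodes_to_pages accounts for most of the match_nodes stage). The following is a minimal sketch, not part of the harvester itself: the function name stage_durations and the regular expression are assumptions made for illustration, and it only pairs lines whose stage name is identical on both the START and STOP entries.

```python
import re
from datetime import datetime

# Matches lines such as: [START] [2019-10-14 09:30:58] match_nodes
LINE_RE = re.compile(
    r"^\[(START|STOP)\] \[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\] (.+)$"
)


def stage_durations(lines):
    """Return {stage_name: whole seconds} for each matched START/STOP pair.

    Hypothetical helper: a STOP line whose name differs from its START
    line (e.g. one with a ", took ..." suffix) is ignored.
    """
    started = {}
    durations = {}
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # skip [INFO], [CMD], [WARN] and header lines
        kind, stamp, name = match.groups()
        when = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S")
        if kind == "START":
            started[name] = when
        elif name in started:
            elapsed = when - started.pop(name)
            durations[name] = int(elapsed.total_seconds())
    return durations


# Four lines copied from the log above.
sample = [
    "[START] [2019-10-14 09:30:58] match_nodes",
    "[START] [2019-10-14 09:30:58] map_all_nodes_to_pages",
    "[STOP] [2019-10-14 09:41:43] map_all_nodes_to_pages",
    "[STOP] [2019-10-14 09:41:47] match_nodes",
]
durations = stage_durations(sample)
print(durations)
```

Because the log timestamps have one-second resolution, the computed durations are only accurate to about a second per stage.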
Latest Process