Harvest for Aegean Sea Species List
Created: 01 Oct 15:53

Stage: completed
Fetched: 01 Oct 15:53
Validated: 01 Oct 15:53
Deltas Created: 01 Oct 15:53
Units Normalized: 01 Oct 16:01
Ancestry Built: 01 Oct 15:55
Nodes Matched: 01 Oct 16:01
Names Parsed: 01 Oct 15:55
New Models Stored: 01 Oct 15:54
Indexed: 01 Oct 16:01
Completed: 01 Oct 16:03
Time to Harvest: about 10 minutes (the log reports 606.2 seconds, 15:53:33 to 16:03:39)

Harvesting Log (most recent first)

# Logfile created on 2019-10-01 15:53:33 -0400 by logger.rb/56815
[START] [2019-10-01 15:53:33] logged process
[START] [2019-10-01 15:53:33] create_harvest_instance
[STOP] [2019-10-01 15:53:33] create_harvest_instance
[START] [2019-10-01 15:53:33] fetch_files
[STOP] [2019-10-01 15:53:33] fetch_files
[START] [2019-10-01 15:53:33] validate_each_file
[STOP] [2019-10-01 15:53:34] validate_each_file
[START] [2019-10-01 15:53:34] convert_to_csv
[CMD] [2019-10-01 15:53:34] /usr/bin/sort /app/public/converted_csv/aegean_sea_sp_l2_refs_14726.csv > /app/public/converted_csv/aegean_sea_sp_l2_refs_14726.csv_sorted
[CMD] [2019-10-01 15:53:35] /usr/bin/sort /app/public/converted_csv/aegean_sea_sp_l2_nodes_14727.csv > /app/public/converted_csv/aegean_sea_sp_l2_nodes_14727.csv_sorted
[CMD] [2019-10-01 15:53:37] /usr/bin/sort /app/public/converted_csv/aegean_sea_sp_l2_occurrences_14728.csv > /app/public/converted_csv/aegean_sea_sp_l2_occurrences_14728.csv_sorted
[CMD] [2019-10-01 15:53:38] /usr/bin/sort /app/public/converted_csv/aegean_sea_sp_l2_measurements_14729.csv > /app/public/converted_csv/aegean_sea_sp_l2_measurements_14729.csv_sorted
[STOP] [2019-10-01 15:53:40] convert_to_csv
[START] [2019-10-01 15:53:40] calculate_delta
[CMD] [2019-10-01 15:53:40] echo "0a" > /app/public/diff/aegean_sea_sp_l2_refs_14726.diff
[CMD] [2019-10-01 15:53:41] tail -n +1 /app/public/converted_csv/aegean_sea_sp_l2_refs_14726.csv >> /app/public/diff/aegean_sea_sp_l2_refs_14726.diff
[CMD] [2019-10-01 15:53:43] echo "." >> /app/public/diff/aegean_sea_sp_l2_refs_14726.diff
[CMD] [2019-10-01 15:53:44] echo "0a" > /app/public/diff/aegean_sea_sp_l2_nodes_14727.diff
[CMD] [2019-10-01 15:53:46] tail -n +1 /app/public/converted_csv/aegean_sea_sp_l2_nodes_14727.csv >> /app/public/diff/aegean_sea_sp_l2_nodes_14727.diff
[CMD] [2019-10-01 15:53:48] echo "." >> /app/public/diff/aegean_sea_sp_l2_nodes_14727.diff
[CMD] [2019-10-01 15:53:49] echo "0a" > /app/public/diff/aegean_sea_sp_l2_occurrences_14728.diff
[CMD] [2019-10-01 15:53:51] tail -n +1 /app/public/converted_csv/aegean_sea_sp_l2_occurrences_14728.csv >> /app/public/diff/aegean_sea_sp_l2_occurrences_14728.diff
[CMD] [2019-10-01 15:53:52] echo "." >> /app/public/diff/aegean_sea_sp_l2_occurrences_14728.diff
[CMD] [2019-10-01 15:53:54] echo "0a" > /app/public/diff/aegean_sea_sp_l2_measurements_14729.diff
[CMD] [2019-10-01 15:53:55] tail -n +1 /app/public/converted_csv/aegean_sea_sp_l2_measurements_14729.csv >> /app/public/diff/aegean_sea_sp_l2_measurements_14729.diff
[CMD] [2019-10-01 15:53:57] echo "." >> /app/public/diff/aegean_sea_sp_l2_measurements_14729.diff
[STOP] [2019-10-01 15:53:58] calculate_delta
[START] [2019-10-01 15:53:58] parse_diff_and_store
[INFO] [2019-10-01 15:54:00] Loading refs diff file into memory (true lines)...
[INFO] [2019-10-01 15:54:01] Loading nodes diff file into memory (true lines)...
[INFO] [2019-10-01 15:54:05] Loading occurrences diff file into memory (true lines)...
[INFO] [2019-10-01 15:54:07] Loading measurements diff file into memory (true lines)...
[INFO] [2019-10-01 15:54:24] Storing 2 References
[INFO] [2019-10-01 15:54:24] Processing group of 2 in 1 groups of 1000
[INFO] [2019-10-01 15:54:24] Average Time: 0.0
[INFO] [2019-10-01 15:54:24] Total Time: 1s
[INFO] [2019-10-01 15:54:24] Storing 5937 ScientificNames
[INFO] [2019-10-01 15:54:24] Processing group of 5937 in 6 groups of 1000
[INFO] [2019-10-01 15:54:26] Average Time: 0.46
[INFO] [2019-10-01 15:54:26] Total Time: 3s
[INFO] [2019-10-01 15:54:26] Storing 5937 Nodes
[INFO] [2019-10-01 15:54:26] Processing group of 5937 in 6 groups of 1000
[INFO] [2019-10-01 15:54:28] Average Time: 0.295
[INFO] [2019-10-01 15:54:28] Total Time: 2s
[INFO] [2019-10-01 15:54:28] Storing 3104 Occurrences
[INFO] [2019-10-01 15:54:28] Processing group of 3104 in 4 groups of 1000
[INFO] [2019-10-01 15:54:29] Average Time: 0.078
[INFO] [2019-10-01 15:54:29] Total Time: 1s
[INFO] [2019-10-01 15:54:29] Storing 6208 TraitsReferences
[INFO] [2019-10-01 15:54:29] Processing group of 6208 in 7 groups of 1000
[INFO] [2019-10-01 15:54:29] Average Time: 0.076
[INFO] [2019-10-01 15:54:29] Total Time: 1s
[INFO] [2019-10-01 15:54:29] last 3 / first 3: 0.81
[INFO] [2019-10-01 15:54:29] Std.Dev: 0.044721359549995794; Max: 0.14
[INFO] [2019-10-01 15:54:29] Storing 6208 Traits
[INFO] [2019-10-01 15:54:29] Processing group of 6208 in 7 groups of 1000
[INFO] [2019-10-01 15:54:31] Average Time: 0.303
[INFO] [2019-10-01 15:54:31] Total Time: 3s
[INFO] [2019-10-01 15:54:31] last 3 / first 3: 0.62
[INFO] [2019-10-01 15:54:31] Std.Dev: 0.11832159566199232; Max: 0.44
[INFO] [2019-10-01 15:54:31] Storing 6207 MetaTraits
[INFO] [2019-10-01 15:54:31] Processing group of 6207 in 7 groups of 1000
[INFO] [2019-10-01 15:54:33] Average Time: 0.176
[INFO] [2019-10-01 15:54:33] Total Time: 2s
[INFO] [2019-10-01 15:54:33] last 3 / first 3: 0.38
[INFO] [2019-10-01 15:54:33] Std.Dev: 0.1341640786499874; Max: 0.46
[STOP] [2019-10-01 15:54:33] parse_diff_and_store
[START] [2019-10-01 15:54:33] resolve_keys
[INFO] [2019-10-01 15:54:59] Occurrences to nodes (through scientific_names)...
[INFO] [2019-10-01 15:55:02] traits to occurrences...
[INFO] [2019-10-01 15:55:03] traits to nodes (through occurrences)...
[INFO] [2019-10-01 15:55:03] Traits to sex term...
[INFO] [2019-10-01 15:55:04] Traits to lifestage term...
[INFO] [2019-10-01 15:55:06] MetaTraits to traits...
[INFO] [2019-10-01 15:55:06] MetaTraits (simple, measurement row refers to parent) to traits...
[INFO] [2019-10-01 15:55:07] Assocs to occurrences...
[INFO] [2019-10-01 15:55:07] Assocs to nodes...
[INFO] [2019-10-01 15:55:07] Assoc to sex term...
[INFO] [2019-10-01 15:55:07] Assoc to lifestage term...
[STOP] [2019-10-01 15:55:07] resolve_keys
[START] [2019-10-01 15:55:07] hold_for_later_1
[STOP] [2019-10-01 15:55:07] hold_for_later_1
[START] [2019-10-01 15:55:07] hold_for_later_2
[STOP] [2019-10-01 15:55:07] hold_for_later_2
[START] [2019-10-01 15:55:07] resolve_missing_parents
[STOP] [2019-10-01 15:55:20] resolve_missing_parents
[START] [2019-10-01 15:55:20] rebuild_nodes
[START] [2019-10-01 15:55:20] Flattener#flatten
[START] [2019-10-01 15:55:20] Flattener#study_resource
[START] [2019-10-01 15:55:20] Flattener#build_ancestry
[STOP] [2019-10-01 15:55:20] Flattener#build_ancestry
[INFO] [2019-10-01 15:55:20] 5937 ancestry keys
[START] [2019-10-01 15:55:20] build_node_ancestors
[INFO] [2019-10-01 15:55:20] old ancestors deleted.
[STOP] [2019-10-01 15:55:22] build_node_ancestors
[START] [2019-10-01 15:55:24] Flattener#propagate_ancestor_ids
[STOP] [2019-10-01 15:55:25] Flattener#propagate_ancestor_ids
[STOP] [2019-10-01 15:55:25] Flattener#flatten
[STOP] [2019-10-01 15:55:25] rebuild_nodes
[START] [2019-10-01 15:55:25] resolve_missing_media_owners
[STOP] [2019-10-01 15:55:25] resolve_missing_media_owners
[START] [2019-10-01 15:55:25] sanitize_media_verbatims
[STOP] [2019-10-01 15:55:25] sanitize_media_verbatims
[START] [2019-10-01 15:55:25] queue_downloads
[STOP] [2019-10-01 15:55:25] queue_downloads
[START] [2019-10-01 15:55:25] parse_names
[WARN] [2019-10-01 15:55:25] I see 5937 names which still need to be parsed.
[STOP] [2019-10-01 15:55:30] parse_names
[START] [2019-10-01 15:55:30] denormalize_canonical_names_to_nodes
[STOP] [2019-10-01 15:55:30] denormalize_canonical_names_to_nodes
[START] [2019-10-01 15:55:30] match_nodes
[START] [2019-10-01 15:55:30] map_all_nodes_to_pages
[STOP] [2019-10-01 16:01:14] map_all_nodes_to_pages
[INFO] [2019-10-01 16:01:14] 271 Unmatched nodes (of 5937)! That's too many to output. First 10: Granuloreticulosea (#47331033); Globigerinoides sacculifer (#47331183); Globigerinoides triloba (#47331746); Globigerinoides trilobus (#47331756); Globigerina digitata (#47331342); Globigerina calida (#47332611); Globoturborotalita tenella (#47331212); Globigerinella aequilateralis (#47331882); Globorotalia inflata (#47331213); Dentagloborotalia (#47333654)
[START] [2019-10-01 16:01:14] update_nodes
[STOP] [2019-10-01 16:01:16] update_nodes
[STOP] [2019-10-01 16:01:16] match_nodes
[START] [2019-10-01 16:01:16] reindex_search
[STOP] [2019-10-01 16:01:28] reindex_search
[START] [2019-10-01 16:01:28] normalize_units
[STOP] [2019-10-01 16:01:28] normalize_units
[START] [2019-10-01 16:01:28] calculate_statistics
[STOP] [2019-10-01 16:01:28] calculate_statistics
[START] [2019-10-01 16:01:28] complete_harvest_instance
[START] [2019-10-01 16:01:28] overall_tsv_creation
[INFO] [2019-10-01 16:01:28] Processing group of 5937 in 1 batches of 10000
[INFO] [2019-10-01 16:02:37] 3104 Traits (unfiltered)...
[INFO] [2019-10-01 16:02:50] 3104 Traits (filtered)...
[INFO] [2019-10-01 16:02:50] 0 Associations (filtered)...
[INFO] [2019-10-01 16:03:31] 15519 metadata added.
[INFO] [2019-10-01 16:03:31] 0 metadata added.
[INFO] [2019-10-01 16:03:31] Average Time: 98.03
[INFO] [2019-10-01 16:03:31] Total Time: 2m4s
[STOP] [2019-10-01 16:03:31] overall_tsv_creation
[INFO] [2019-10-01 16:03:31] Done. Check your files:
[INFO] [2019-10-01 16:03:33] (5937 lines) /app/public/data/aegean_sea_sp_l2/publish_nodes.tsv
[INFO] [2019-10-01 16:03:34] (30032 lines) /app/public/data/aegean_sea_sp_l2/publish_node_ancestors.tsv
[INFO] [2019-10-01 16:03:36] (5937 lines) /app/public/data/aegean_sea_sp_l2/publish_scientific_names.tsv
[INFO] [2019-10-01 16:03:37] (3105 lines) /app/public/data/aegean_sea_sp_l2/publish_traits.tsv
[INFO] [2019-10-01 16:03:39] (15520 lines) /app/public/data/aegean_sea_sp_l2/publish_metadata.tsv
[STOP] [2019-10-01 16:03:39] complete_harvest_instance
[START] [2019-10-01 16:03:39] completed
[STOP] [2019-10-01 16:03:39] completed
[STOP] [2019-10-01 16:03:39] logged process, took 606.2
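
The [START]/[STOP] pairs in the log above can be turned into per-step durations, which makes it easier to see where the time went (here, mostly in map_all_nodes_to_pages). The following is only a minimal sketch: the function name step_durations and the regular expression are assumptions made for illustration and are not part of the harvester, and it assumes every log line follows the "[KIND] [timestamp] message" layout seen above.

```python
import re
from datetime import datetime

# Assumed layout, taken from the log above: [START|STOP] [YYYY-MM-DD HH:MM:SS] step name
LINE_RE = re.compile(
    r"^\[(START|STOP)\] \[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\] (.+)$"
)


def step_durations(lines):
    """Return (step, seconds) pairs in the order the steps stopped."""
    open_steps = {}  # step name -> start time
    results = []
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # [INFO], [CMD], [WARN] and header lines are skipped
        kind, stamp, name = m.groups()
        when = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S")
        if kind == "START":
            open_steps[name] = when
        else:
            # The final STOP line carries a ", took N" suffix; drop it.
            name = name.split(", took")[0]
            started = open_steps.pop(name, None)
            if started is not None:
                results.append((name, (when - started).total_seconds()))
    return results


# A few lines copied from the log above.
sample = [
    "[START] [2019-10-01 15:55:30] match_nodes",
    "[START] [2019-10-01 15:55:30] map_all_nodes_to_pages",
    "[STOP] [2019-10-01 16:01:14] map_all_nodes_to_pages",
    "[START] [2019-10-01 16:01:14] update_nodes",
    "[STOP] [2019-10-01 16:01:16] update_nodes",
    "[STOP] [2019-10-01 16:01:16] match_nodes",
]

for name, seconds in step_durations(sample):
    print(f"{name}: {seconds:.0f}s")
```

Timestamps in the log have one-second resolution, so short steps will show as 0s; nested steps (such as update_nodes inside match_nodes) are reported separately and also counted in their parent.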