Stage:
completed
Fetched:
13 Oct 06:12
Validated:
13 Oct 06:12
Deltas Created:
13 Oct 06:12
Units Normalized:
13 Oct 06:35
Ancestry Built:
13 Oct 06:16
Nodes Matched:
13 Oct 06:34
Names Parsed:
13 Oct 06:16
New Models Stored:
13 Oct 06:14
Indexed:
13 Oct 06:35
Completed:
13 Oct 06:40
Time to Harvest:
less than a minute
Harvesting Log
(156 lines)
# Logfile created on 2019-10-13 06:12:22 -0400 by logger.rb/56815
[START] [2019-10-13 06:12:22] logged process
[START] [2019-10-13 06:12:22] create_harvest_instance
[STOP] [2019-10-13 06:12:22] create_harvest_instance
[START] [2019-10-13 06:12:22] fetch_files
[STOP] [2019-10-13 06:12:22] fetch_files
[START] [2019-10-13 06:12:22] validate_each_file
[STOP] [2019-10-13 06:12:25] validate_each_file
[START] [2019-10-13 06:12:25] convert_to_csv
[CMD] [2019-10-13 06:12:25] /usr/bin/sort /app/public/converted_csv/guatemala_sp_lis_refs_15915.csv > /app/public/converted_csv/guatemala_sp_lis_refs_15915.csv_sorted
[CMD] [2019-10-13 06:12:25] /usr/bin/sort /app/public/converted_csv/guatemala_sp_lis_nodes_15916.csv > /app/public/converted_csv/guatemala_sp_lis_nodes_15916.csv_sorted
[CMD] [2019-10-13 06:12:25] /usr/bin/sort /app/public/converted_csv/guatemala_sp_lis_occurrences_15917.csv > /app/public/converted_csv/guatemala_sp_lis_occurrences_15917.csv_sorted
[CMD] [2019-10-13 06:12:25] /usr/bin/sort /app/public/converted_csv/guatemala_sp_lis_measurements_15918.csv > /app/public/converted_csv/guatemala_sp_lis_measurements_15918.csv_sorted
[STOP] [2019-10-13 06:12:25] convert_to_csv
[START] [2019-10-13 06:12:25] calculate_delta
[CMD] [2019-10-13 06:12:25] echo "0a" > /app/public/diff/guatemala_sp_lis_refs_15915.diff
[CMD] [2019-10-13 06:12:25] tail -n +1 /app/public/converted_csv/guatemala_sp_lis_refs_15915.csv >> /app/public/diff/guatemala_sp_lis_refs_15915.diff
[CMD] [2019-10-13 06:12:25] echo "." >> /app/public/diff/guatemala_sp_lis_refs_15915.diff
[CMD] [2019-10-13 06:12:25] echo "0a" > /app/public/diff/guatemala_sp_lis_nodes_15916.diff
[CMD] [2019-10-13 06:12:25] tail -n +1 /app/public/converted_csv/guatemala_sp_lis_nodes_15916.csv >> /app/public/diff/guatemala_sp_lis_nodes_15916.diff
[CMD] [2019-10-13 06:12:26] echo "." >> /app/public/diff/guatemala_sp_lis_nodes_15916.diff
[CMD] [2019-10-13 06:12:26] echo "0a" > /app/public/diff/guatemala_sp_lis_occurrences_15917.diff
[CMD] [2019-10-13 06:12:26] tail -n +1 /app/public/converted_csv/guatemala_sp_lis_occurrences_15917.csv >> /app/public/diff/guatemala_sp_lis_occurrences_15917.diff
[CMD] [2019-10-13 06:12:26] echo "." >> /app/public/diff/guatemala_sp_lis_occurrences_15917.diff
[CMD] [2019-10-13 06:12:26] echo "0a" > /app/public/diff/guatemala_sp_lis_measurements_15918.diff
[CMD] [2019-10-13 06:12:26] tail -n +1 /app/public/converted_csv/guatemala_sp_lis_measurements_15918.csv >> /app/public/diff/guatemala_sp_lis_measurements_15918.diff
[CMD] [2019-10-13 06:12:26] echo "." >> /app/public/diff/guatemala_sp_lis_measurements_15918.diff
[STOP] [2019-10-13 06:12:26] calculate_delta
[START] [2019-10-13 06:12:26] parse_diff_and_store
[INFO] [2019-10-13 06:12:26] Loading refs diff file into memory (true lines)...
[INFO] [2019-10-13 06:12:26] Loading nodes diff file into memory (true lines)...
[INFO] [2019-10-13 06:12:33] Loading occurrences diff file into memory (true lines)...
[INFO] [2019-10-13 06:12:36] Loading measurements diff file into memory (true lines)...
[INFO] [2019-10-13 06:13:52] Storing 2 References
[INFO] [2019-10-13 06:13:52] Processing group of 2 in 1 groups of 1000
[INFO] [2019-10-13 06:13:52] Average Time: 0.0
[INFO] [2019-10-13 06:13:52] Total Time: 1s
[INFO] [2019-10-13 06:13:52] Storing 18096 ScientificNames
[INFO] [2019-10-13 06:13:52] Processing group of 18096 in 19 groups of 1000
[INFO] [2019-10-13 06:13:59] Average Time: 0.379
[INFO] [2019-10-13 06:13:59] Total Time: 8s
[INFO] [2019-10-13 06:13:59] last 3 / first 3: 0.42
[INFO] [2019-10-13 06:13:59] Std.Dev: 0.17606816861659008; Max: 1.03
[INFO] [2019-10-13 06:13:59] Storing 18096 Nodes
[INFO] [2019-10-13 06:13:59] Processing group of 18096 in 19 groups of 1000
[INFO] [2019-10-13 06:14:05] Average Time: 0.323
[INFO] [2019-10-13 06:14:05] Total Time: 7s
[INFO] [2019-10-13 06:14:05] last 3 / first 3: 1.21
[INFO] [2019-10-13 06:14:05] Std.Dev: 0.10488088481701516; Max: 0.65
[INFO] [2019-10-13 06:14:05] Storing 13013 Occurrences
[INFO] [2019-10-13 06:14:05] Processing group of 13013 in 14 groups of 1000
[INFO] [2019-10-13 06:14:07] Average Time: 0.125
[INFO] [2019-10-13 06:14:07] Total Time: 2s
[INFO] [2019-10-13 06:14:07] last 3 / first 3: 0.38
[INFO] [2019-10-13 06:14:07] Std.Dev: 0.07071067811865475; Max: 0.3
[INFO] [2019-10-13 06:14:07] Storing 26254 TraitsReferences
[INFO] [2019-10-13 06:14:07] Processing group of 26254 in 27 groups of 1000
[INFO] [2019-10-13 06:14:09] Average Time: 0.074
[INFO] [2019-10-13 06:14:09] Total Time: 3s
[INFO] [2019-10-13 06:14:09] last 3 / first 3: 0.55
[INFO] [2019-10-13 06:14:09] Std.Dev: 0.0; Max: 0.15
[INFO] [2019-10-13 06:14:09] Storing 26253 Traits
[INFO] [2019-10-13 06:14:09] Processing group of 26253 in 27 groups of 1000
[INFO] [2019-10-13 06:14:18] Average Time: 0.338
[INFO] [2019-10-13 06:14:18] Total Time: 10s
[INFO] [2019-10-13 06:14:18] last 3 / first 3: 0.51
[INFO] [2019-10-13 06:14:18] Std.Dev: 0.11401754250991379; Max: 0.7
[INFO] [2019-10-13 06:14:18] Storing 26238 MetaTraits
[INFO] [2019-10-13 06:14:18] Processing group of 26238 in 27 groups of 1000
[INFO] [2019-10-13 06:14:22] Average Time: 0.126
[INFO] [2019-10-13 06:14:22] Total Time: 4s
[INFO] [2019-10-13 06:14:22] last 3 / first 3: 0.61
[INFO] [2019-10-13 06:14:22] Std.Dev: 0.03162277660168379; Max: 0.2
[STOP] [2019-10-13 06:14:22] parse_diff_and_store
[START] [2019-10-13 06:14:22] resolve_keys
[INFO] [2019-10-13 06:15:24] Occurrences to nodes (through scientific_names)...
[INFO] [2019-10-13 06:15:30] traits to occurrences...
[INFO] [2019-10-13 06:15:34] traits to nodes (through occurrences)...
[INFO] [2019-10-13 06:15:34] Traits to sex term...
[INFO] [2019-10-13 06:15:38] Traits to lifestage term...
[INFO] [2019-10-13 06:15:42] MetaTraits to traits...
[INFO] [2019-10-13 06:15:44] MetaTraits (simple, measurement row refers to parent) to traits...
[INFO] [2019-10-13 06:15:48] Assocs to occurrences...
[INFO] [2019-10-13 06:15:48] Assocs to nodes...
[INFO] [2019-10-13 06:15:48] Assoc to sex term...
[INFO] [2019-10-13 06:15:48] Assoc to lifestage term...
[STOP] [2019-10-13 06:15:48] resolve_keys
[START] [2019-10-13 06:15:48] hold_for_later_1
[STOP] [2019-10-13 06:15:48] hold_for_later_1
[START] [2019-10-13 06:15:48] hold_for_later_2
[STOP] [2019-10-13 06:15:48] hold_for_later_2
[START] [2019-10-13 06:15:48] resolve_missing_parents
[STOP] [2019-10-13 06:16:27] resolve_missing_parents
[START] [2019-10-13 06:16:27] rebuild_nodes
[START] [2019-10-13 06:16:27] Flattener#flatten
[START] [2019-10-13 06:16:27] Flattener#study_resource
[START] [2019-10-13 06:16:27] Flattener#build_ancestry
[STOP] [2019-10-13 06:16:29] Flattener#build_ancestry
[INFO] [2019-10-13 06:16:29] 18096 ancestry keys
[START] [2019-10-13 06:16:29] build_node_ancestors
[INFO] [2019-10-13 06:16:29] old ancestors deleted.
[STOP] [2019-10-13 06:16:32] build_node_ancestors
[START] [2019-10-13 06:16:36] Flattener#propagate_ancestor_ids
[STOP] [2019-10-13 06:16:37] Flattener#propagate_ancestor_ids
[STOP] [2019-10-13 06:16:37] Flattener#flatten
[STOP] [2019-10-13 06:16:37] rebuild_nodes
[START] [2019-10-13 06:16:37] resolve_missing_media_owners
[STOP] [2019-10-13 06:16:37] resolve_missing_media_owners
[START] [2019-10-13 06:16:37] sanitize_media_verbatims
[STOP] [2019-10-13 06:16:37] sanitize_media_verbatims
[START] [2019-10-13 06:16:37] queue_downloads
[STOP] [2019-10-13 06:16:37] queue_downloads
[START] [2019-10-13 06:16:37] parse_names
[WARN] [2019-10-13 06:16:37] I see 18096 names which still need to be parsed.
[STOP] [2019-10-13 06:16:52] parse_names
[START] [2019-10-13 06:16:52] denormalize_canonical_names_to_nodes
[STOP] [2019-10-13 06:16:52] denormalize_canonical_names_to_nodes
[START] [2019-10-13 06:16:52] match_nodes
[START] [2019-10-13 06:16:52] map_all_nodes_to_pages
[STOP] [2019-10-13 06:34:26] map_all_nodes_to_pages
[INFO] [2019-10-13 06:34:26] 1116 Unmatched nodes (of 18096)! That's too many to output. First 10: Molothrus oryzivora (#49956745); Attila flammulatus (#49954772); Pyrocephalus coronatus (#49951870); Thryothorus maculipectus (#49949195); Thryothorus modestus (#49952433); Thryothorus pleurostictus (#49958231); Sporophila funerea (#49948075); Geothlypis formosus (#49951768); Seiurus noveboracensis (#49952294); Seiurus aurocapillus (#49954143)
[START] [2019-10-13 06:34:26] update_nodes
[STOP] [2019-10-13 06:34:32] update_nodes
[STOP] [2019-10-13 06:34:32] match_nodes
[START] [2019-10-13 06:34:32] reindex_search
[STOP] [2019-10-13 06:35:18] reindex_search
[START] [2019-10-13 06:35:18] normalize_units
[STOP] [2019-10-13 06:35:18] normalize_units
[START] [2019-10-13 06:35:18] calculate_statistics
[STOP] [2019-10-13 06:35:18] calculate_statistics
[START] [2019-10-13 06:35:18] complete_harvest_instance
[START] [2019-10-13 06:35:18] overall_tsv_creation
[INFO] [2019-10-13 06:35:18] Processing group of 18096 in 2 batches of 10000
[INFO] [2019-10-13 06:36:47] 6599 Traits (unfiltered)...
[INFO] [2019-10-13 06:37:01] 6599 Traits (filtered)...
[INFO] [2019-10-13 06:37:01] 0 Associations (filtered)...
[INFO] [2019-10-13 06:37:52] 32989 metadata added.
[INFO] [2019-10-13 06:37:52] 0 metadata added.
[INFO] [2019-10-13 06:39:14] 6414 Traits (unfiltered)...
[INFO] [2019-10-13 06:39:28] 6414 Traits (filtered)...
[INFO] [2019-10-13 06:39:28] 0 Associations (filtered)...
[INFO] [2019-10-13 06:40:20] 32060 metadata added.
[INFO] [2019-10-13 06:40:20] 0 metadata added.
[INFO] [2019-10-13 06:40:20] Average Time: 125.64
[INFO] [2019-10-13 06:40:20] Total Time: 5m3s
[STOP] [2019-10-13 06:40:20] overall_tsv_creation
[INFO] [2019-10-13 06:40:20] Done. Check your files:
[INFO] [2019-10-13 06:40:20] (18096 lines) /app/public/data/guatemala_sp_lis/publish_nodes.tsv
[INFO] [2019-10-13 06:40:20] (47533 lines) /app/public/data/guatemala_sp_lis/publish_node_ancestors.tsv
[INFO] [2019-10-13 06:40:21] (18096 lines) /app/public/data/guatemala_sp_lis/publish_scientific_names.tsv
[INFO] [2019-10-13 06:40:21] (13014 lines) /app/public/data/guatemala_sp_lis/publish_traits.tsv
[INFO] [2019-10-13 06:40:21] (65050 lines) /app/public/data/guatemala_sp_lis/publish_metadata.tsv
[STOP] [2019-10-13 06:40:21] complete_harvest_instance
[START] [2019-10-13 06:40:21] completed
[STOP] [2019-10-13 06:40:21] completed
[STOP] [2019-10-13 06:40:21] logged process, took 1679.14
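The [START]/[STOP] pairs in the log above can be reduced to per-stage durations, which makes it easier to see where the run spent its time (for example, map_all_nodes_to_pages accounts for most of it). The following is a minimal sketch, not part of the harvester itself: the function name stage_durations and the regular expression are assumptions made for illustration, and the sketch assumes each stage name appears verbatim on both its START and STOP line. The final "logged process, took 1679.14" line does not satisfy that assumption, so it is simply left unpaired.

```python
import re
from datetime import datetime

# Matches lines such as "[START] [2019-10-13 06:12:22] fetch_files".
# The pattern is an assumption based on the log lines shown above.
LINE_RE = re.compile(
    r"^\[(START|STOP)\] \[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\] (.+)$"
)


def stage_durations(lines):
    """Return {stage_name: seconds} for every START/STOP pair found."""
    started = {}    # stage name -> datetime of its START line
    durations = {}  # stage name -> elapsed seconds
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # [INFO], [CMD], [WARN] and header lines are ignored
        kind, stamp, name = match.groups()
        when = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S")
        if kind == "START":
            started[name] = when
        elif name in started:
            durations[name] = (when - started.pop(name)).total_seconds()
    return durations


# Four lines copied from the log above.
sample = [
    "[START] [2019-10-13 06:16:52] match_nodes",
    "[START] [2019-10-13 06:16:52] map_all_nodes_to_pages",
    "[STOP] [2019-10-13 06:34:26] map_all_nodes_to_pages",
    "[STOP] [2019-10-13 06:34:32] match_nodes",
]

for stage, seconds in stage_durations(sample).items():
    print(f"{stage}: {seconds:.0f}s")
```

Because timestamps in the log have one-second resolution, stages that start and stop within the same second are reported as 0 seconds.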
Latest Process