Harvest for Copepod sizes Created 10 Jun 08:21

Stage: completed
Fetched: 10 Jun 08:21
Validated: 10 Jun 08:21
Deltas Created: 10 Jun 08:22
Units Normalized: 10 Jun 08:22
Ancestry Built: 10 Jun 08:22
Nodes Matched: 10 Jun 08:22
Names Parsed: 10 Jun 08:22
New Models Stored: 10 Jun 08:22
Indexed: 10 Jun 08:22
Completed: 10 Jun 08:25
Time to Harvest: about 4 minutes

Harvesting Log

(217 lines)
[INFO] [2021-06-10 08:21:46] Created harvest instance #4010
[STOP] [2021-06-10 08:21:46] create_harvest_instance
[START] [2021-06-10 08:21:46] fetch_files
[STOP] [2021-06-10 08:21:46] fetch_files
[START] [2021-06-10 08:21:46] validate_each_file
[INFO] [2021-06-10 08:21:46] Looping over 8 formats...
[INFO] [2021-06-10 08:21:46] ...agents (/app/public/data/copepod_sizes_dw/agents.txt)
[INFO] [2021-06-10 08:21:46] Valid: /app/public/converted_csv/copepod_sizes_dw_agents_4010.csv (0 lines)
[INFO] [2021-06-10 08:21:46] ...refs (/app/public/data/copepod_sizes_dw/references.txt)
[INFO] [2021-06-10 08:21:46] Valid: /app/public/converted_csv/copepod_sizes_dw_refs_4010.csv (0 lines)
[INFO] [2021-06-10 08:21:46] ...nodes (/app/public/data/copepod_sizes_dw/taxa.txt)
[INFO] [2021-06-10 08:21:46] Valid: /app/public/converted_csv/copepod_sizes_dw_nodes_4010.csv (68 lines)
[INFO] [2021-06-10 08:21:46] ...media (/app/public/data/copepod_sizes_dw/media.txt)
[INFO] [2021-06-10 08:21:46] Valid: /app/public/converted_csv/copepod_sizes_dw_media_4010.csv (0 lines)
[INFO] [2021-06-10 08:21:46] ...vernaculars (/app/public/data/copepod_sizes_dw/common names.txt)
[INFO] [2021-06-10 08:21:46] Valid: /app/public/converted_csv/copepod_sizes_dw_vernaculars_4010.csv (0 lines)
[INFO] [2021-06-10 08:21:46] ...occurrences (/app/public/data/copepod_sizes_dw/occurrences.txt)
[INFO] [2021-06-10 08:21:46] Valid: /app/public/converted_csv/copepod_sizes_dw_occurrences_4010.csv (111 lines)
[INFO] [2021-06-10 08:21:46] ...assocs (/app/public/data/copepod_sizes_dw/associations.txt)
[INFO] [2021-06-10 08:21:46] Valid: /app/public/converted_csv/copepod_sizes_dw_assocs_4010.csv (0 lines)
[INFO] [2021-06-10 08:21:46] ...measurements (/app/public/data/copepod_sizes_dw/measurements or facts.txt)
[INFO] [2021-06-10 08:21:46] Valid: /app/public/converted_csv/copepod_sizes_dw_measurements_4010.csv (221 lines)
[STOP] [2021-06-10 08:21:46] validate_each_file
[START] [2021-06-10 08:21:46] convert_to_csv
[INFO] [2021-06-10 08:21:46] Looping over 8 formats...
[INFO] [2021-06-10 08:21:46] ...agents (/app/public/data/copepod_sizes_dw/agents.txt)
[CMD] [2021-06-10 08:21:46] /usr/bin/sort /app/public/converted_csv/copepod_sizes_dw_agents_4010.csv > /app/public/converted_csv/copepod_sizes_dw_agents_4010.csv_sorted
[INFO] [2021-06-10 08:21:46] Converted: /app/public/converted_csv/copepod_sizes_dw_agents_4010.csv (0 lines)
[INFO] [2021-06-10 08:21:46] ...refs (/app/public/data/copepod_sizes_dw/references.txt)
[CMD] [2021-06-10 08:21:46] /usr/bin/sort /app/public/converted_csv/copepod_sizes_dw_refs_4010.csv > /app/public/converted_csv/copepod_sizes_dw_refs_4010.csv_sorted
[INFO] [2021-06-10 08:21:47] Converted: /app/public/converted_csv/copepod_sizes_dw_refs_4010.csv (0 lines)
[INFO] [2021-06-10 08:21:47] ...nodes (/app/public/data/copepod_sizes_dw/taxa.txt)
[CMD] [2021-06-10 08:21:47] /usr/bin/sort /app/public/converted_csv/copepod_sizes_dw_nodes_4010.csv > /app/public/converted_csv/copepod_sizes_dw_nodes_4010.csv_sorted
[INFO] [2021-06-10 08:21:47] Converted: /app/public/converted_csv/copepod_sizes_dw_nodes_4010.csv (68 lines)
[INFO] [2021-06-10 08:21:47] ...media (/app/public/data/copepod_sizes_dw/media.txt)
[CMD] [2021-06-10 08:21:47] /usr/bin/sort /app/public/converted_csv/copepod_sizes_dw_media_4010.csv > /app/public/converted_csv/copepod_sizes_dw_media_4010.csv_sorted
[INFO] [2021-06-10 08:21:48] Converted: /app/public/converted_csv/copepod_sizes_dw_media_4010.csv (0 lines)
[INFO] [2021-06-10 08:21:48] ...vernaculars (/app/public/data/copepod_sizes_dw/common names.txt)
[CMD] [2021-06-10 08:21:48] /usr/bin/sort /app/public/converted_csv/copepod_sizes_dw_vernaculars_4010.csv > /app/public/converted_csv/copepod_sizes_dw_vernaculars_4010.csv_sorted
[INFO] [2021-06-10 08:21:48] Converted: /app/public/converted_csv/copepod_sizes_dw_vernaculars_4010.csv (0 lines)
[INFO] [2021-06-10 08:21:48] ...occurrences (/app/public/data/copepod_sizes_dw/occurrences.txt)
[CMD] [2021-06-10 08:21:48] /usr/bin/sort /app/public/converted_csv/copepod_sizes_dw_occurrences_4010.csv > /app/public/converted_csv/copepod_sizes_dw_occurrences_4010.csv_sorted
[INFO] [2021-06-10 08:21:49] Converted: /app/public/converted_csv/copepod_sizes_dw_occurrences_4010.csv (111 lines)
[INFO] [2021-06-10 08:21:49] ...assocs (/app/public/data/copepod_sizes_dw/associations.txt)
[CMD] [2021-06-10 08:21:49] /usr/bin/sort /app/public/converted_csv/copepod_sizes_dw_assocs_4010.csv > /app/public/converted_csv/copepod_sizes_dw_assocs_4010.csv_sorted
[INFO] [2021-06-10 08:21:49] Converted: /app/public/converted_csv/copepod_sizes_dw_assocs_4010.csv (0 lines)
[INFO] [2021-06-10 08:21:49] ...measurements (/app/public/data/copepod_sizes_dw/measurements or facts.txt)
[CMD] [2021-06-10 08:21:49] /usr/bin/sort /app/public/converted_csv/copepod_sizes_dw_measurements_4010.csv > /app/public/converted_csv/copepod_sizes_dw_measurements_4010.csv_sorted
[INFO] [2021-06-10 08:21:50] Converted: /app/public/converted_csv/copepod_sizes_dw_measurements_4010.csv (221 lines)
[STOP] [2021-06-10 08:21:50] convert_to_csv
[START] [2021-06-10 08:21:50] calculate_delta
[INFO] [2021-06-10 08:21:50] Looping over 8 formats...
[INFO] [2021-06-10 08:21:50] ...agents (/app/public/data/copepod_sizes_dw/agents.txt)
[CMD] [2021-06-10 08:21:50] echo "0a" > /app/public/diff/copepod_sizes_dw_agents_4010.diff
[CMD] [2021-06-10 08:21:50] tail -n +1 /app/public/converted_csv/copepod_sizes_dw_agents_4010.csv >> /app/public/diff/copepod_sizes_dw_agents_4010.diff
[CMD] [2021-06-10 08:21:51] echo "." >> /app/public/diff/copepod_sizes_dw_agents_4010.diff
[INFO] [2021-06-10 08:21:51] Created diff: /app/public/diff/copepod_sizes_dw_agents_4010.diff (2 lines)
[INFO] [2021-06-10 08:21:51] ...refs (/app/public/data/copepod_sizes_dw/references.txt)
[CMD] [2021-06-10 08:21:51] echo "0a" > /app/public/diff/copepod_sizes_dw_refs_4010.diff
[CMD] [2021-06-10 08:21:52] tail -n +1 /app/public/converted_csv/copepod_sizes_dw_refs_4010.csv >> /app/public/diff/copepod_sizes_dw_refs_4010.diff
[CMD] [2021-06-10 08:21:52] echo "." >> /app/public/diff/copepod_sizes_dw_refs_4010.diff
[INFO] [2021-06-10 08:21:52] Created diff: /app/public/diff/copepod_sizes_dw_refs_4010.diff (2 lines)
[INFO] [2021-06-10 08:21:52] ...nodes (/app/public/data/copepod_sizes_dw/taxa.txt)
[CMD] [2021-06-10 08:21:52] echo "0a" > /app/public/diff/copepod_sizes_dw_nodes_4010.diff
[CMD] [2021-06-10 08:21:53] tail -n +1 /app/public/converted_csv/copepod_sizes_dw_nodes_4010.csv >> /app/public/diff/copepod_sizes_dw_nodes_4010.diff
[CMD] [2021-06-10 08:21:53] echo "." >> /app/public/diff/copepod_sizes_dw_nodes_4010.diff
[INFO] [2021-06-10 08:21:54] Created diff: /app/public/diff/copepod_sizes_dw_nodes_4010.diff (70 lines)
[INFO] [2021-06-10 08:21:54] ...media (/app/public/data/copepod_sizes_dw/media.txt)
[CMD] [2021-06-10 08:21:54] echo "0a" > /app/public/diff/copepod_sizes_dw_media_4010.diff
[CMD] [2021-06-10 08:21:54] tail -n +1 /app/public/converted_csv/copepod_sizes_dw_media_4010.csv >> /app/public/diff/copepod_sizes_dw_media_4010.diff
[CMD] [2021-06-10 08:21:55] echo "." >> /app/public/diff/copepod_sizes_dw_media_4010.diff
[INFO] [2021-06-10 08:21:55] Created diff: /app/public/diff/copepod_sizes_dw_media_4010.diff (2 lines)
[INFO] [2021-06-10 08:21:55] ...vernaculars (/app/public/data/copepod_sizes_dw/common names.txt)
[CMD] [2021-06-10 08:21:55] echo "0a" > /app/public/diff/copepod_sizes_dw_vernaculars_4010.diff
[CMD] [2021-06-10 08:21:56] tail -n +1 /app/public/converted_csv/copepod_sizes_dw_vernaculars_4010.csv >> /app/public/diff/copepod_sizes_dw_vernaculars_4010.diff
[CMD] [2021-06-10 08:21:56] echo "." >> /app/public/diff/copepod_sizes_dw_vernaculars_4010.diff
[INFO] [2021-06-10 08:21:57] Created diff: /app/public/diff/copepod_sizes_dw_vernaculars_4010.diff (2 lines)
[INFO] [2021-06-10 08:21:57] ...occurrences (/app/public/data/copepod_sizes_dw/occurrences.txt)
[CMD] [2021-06-10 08:21:57] echo "0a" > /app/public/diff/copepod_sizes_dw_occurrences_4010.diff
[CMD] [2021-06-10 08:21:57] tail -n +1 /app/public/converted_csv/copepod_sizes_dw_occurrences_4010.csv >> /app/public/diff/copepod_sizes_dw_occurrences_4010.diff
[CMD] [2021-06-10 08:21:58] echo "." >> /app/public/diff/copepod_sizes_dw_occurrences_4010.diff
[INFO] [2021-06-10 08:21:58] Created diff: /app/public/diff/copepod_sizes_dw_occurrences_4010.diff (113 lines)
[INFO] [2021-06-10 08:21:58] ...assocs (/app/public/data/copepod_sizes_dw/associations.txt)
[CMD] [2021-06-10 08:21:58] echo "0a" > /app/public/diff/copepod_sizes_dw_assocs_4010.diff
[CMD] [2021-06-10 08:21:58] tail -n +1 /app/public/converted_csv/copepod_sizes_dw_assocs_4010.csv >> /app/public/diff/copepod_sizes_dw_assocs_4010.diff
[CMD] [2021-06-10 08:21:59] echo "." >> /app/public/diff/copepod_sizes_dw_assocs_4010.diff
[INFO] [2021-06-10 08:21:59] Created diff: /app/public/diff/copepod_sizes_dw_assocs_4010.diff (2 lines)
[INFO] [2021-06-10 08:21:59] ...measurements (/app/public/data/copepod_sizes_dw/measurements or facts.txt)
[CMD] [2021-06-10 08:21:59] echo "0a" > /app/public/diff/copepod_sizes_dw_measurements_4010.diff
[CMD] [2021-06-10 08:22:00] tail -n +1 /app/public/converted_csv/copepod_sizes_dw_measurements_4010.csv >> /app/public/diff/copepod_sizes_dw_measurements_4010.diff
[CMD] [2021-06-10 08:22:00] echo "." >> /app/public/diff/copepod_sizes_dw_measurements_4010.diff
[INFO] [2021-06-10 08:22:01] Created diff: /app/public/diff/copepod_sizes_dw_measurements_4010.diff (223 lines)
[STOP] [2021-06-10 08:22:01] calculate_delta
[START] [2021-06-10 08:22:01] parse_diff_and_store
[INFO] [2021-06-10 08:22:01] Handling diff: /app/public/diff/copepod_sizes_dw_agents_4010.diff (2 lines)
[INFO] [2021-06-10 08:22:01] Loading agents diff file into memory (2 /app/public/diff/copepod_sizes_dw_agents_4010.diff lines)...
[INFO] [2021-06-10 08:22:02] Handling diff: /app/public/diff/copepod_sizes_dw_refs_4010.diff (2 lines)
[INFO] [2021-06-10 08:22:02] Loading refs diff file into memory (2 /app/public/diff/copepod_sizes_dw_refs_4010.diff lines)...
[INFO] [2021-06-10 08:22:03] Handling diff: /app/public/diff/copepod_sizes_dw_nodes_4010.diff (70 lines)
[INFO] [2021-06-10 08:22:03] Loading nodes diff file into memory (70 /app/public/diff/copepod_sizes_dw_nodes_4010.diff lines)...
[INFO] [2021-06-10 08:22:04] Handling diff: /app/public/diff/copepod_sizes_dw_media_4010.diff (2 lines)
[INFO] [2021-06-10 08:22:04] Loading media diff file into memory (2 /app/public/diff/copepod_sizes_dw_media_4010.diff lines)...
[INFO] [2021-06-10 08:22:05] Handling diff: /app/public/diff/copepod_sizes_dw_vernaculars_4010.diff (2 lines)
[INFO] [2021-06-10 08:22:05] Loading vernaculars diff file into memory (2 /app/public/diff/copepod_sizes_dw_vernaculars_4010.diff lines)...
[INFO] [2021-06-10 08:22:06] Handling diff: /app/public/diff/copepod_sizes_dw_occurrences_4010.diff (113 lines)
[INFO] [2021-06-10 08:22:06] Loading occurrences diff file into memory (113 /app/public/diff/copepod_sizes_dw_occurrences_4010.diff lines)...
[INFO] [2021-06-10 08:22:07] Handling diff: /app/public/diff/copepod_sizes_dw_assocs_4010.diff (2 lines)
[INFO] [2021-06-10 08:22:07] Loading assocs diff file into memory (2 /app/public/diff/copepod_sizes_dw_assocs_4010.diff lines)...
[INFO] [2021-06-10 08:22:08] Handling diff: /app/public/diff/copepod_sizes_dw_measurements_4010.diff (223 lines)
[INFO] [2021-06-10 08:22:08] Loading measurements diff file into memory (223 /app/public/diff/copepod_sizes_dw_measurements_4010.diff lines)...
[INFO] [2021-06-10 08:22:09] Storing 71 ScientificNames
[INFO] [2021-06-10 08:22:09] Processing group of 71 in 1 groups of 1000
[INFO] [2021-06-10 08:22:09] Average Time: 0.03
[INFO] [2021-06-10 08:22:09] Total Time: 1s
[INFO] [2021-06-10 08:22:09] Storing 71 Nodes
[INFO] [2021-06-10 08:22:09] Processing group of 71 in 1 groups of 1000
[INFO] [2021-06-10 08:22:09] Average Time: 0.03
[INFO] [2021-06-10 08:22:09] Total Time: 1s
[INFO] [2021-06-10 08:22:09] Storing 111 Occurrences
[INFO] [2021-06-10 08:22:09] Processing group of 111 in 1 groups of 1000
[INFO] [2021-06-10 08:22:09] Average Time: 0.04
[INFO] [2021-06-10 08:22:09] Total Time: 1s
[INFO] [2021-06-10 08:22:09] Storing 186 OccurrenceMetadata
[INFO] [2021-06-10 08:22:09] Processing group of 186 in 1 groups of 1000
[INFO] [2021-06-10 08:22:09] Average Time: 0.09
[INFO] [2021-06-10 08:22:09] Total Time: 1s
[INFO] [2021-06-10 08:22:09] Storing 221 Traits
[INFO] [2021-06-10 08:22:09] Processing group of 221 in 1 groups of 1000
[INFO] [2021-06-10 08:22:10] Average Time: 0.68
[INFO] [2021-06-10 08:22:10] Total Time: 1s
[INFO] [2021-06-10 08:22:10] Storing 221 MetaTraits
[INFO] [2021-06-10 08:22:10] Processing group of 221 in 1 groups of 1000
[INFO] [2021-06-10 08:22:10] Average Time: 0.05
[INFO] [2021-06-10 08:22:10] Total Time: 1s
[STOP] [2021-06-10 08:22:10] parse_diff_and_store
[START] [2021-06-10 08:22:10] resolve_keys
[INFO] [2021-06-10 08:22:17] Occurrences to nodes (through scientific_names)...
[INFO] [2021-06-10 08:22:18] traits to occurrences...
[INFO] [2021-06-10 08:22:18] traits to nodes (through occurrences)...
[INFO] [2021-06-10 08:22:18] Traits to sex term...
[INFO] [2021-06-10 08:22:18] Traits to lifestage term...
[INFO] [2021-06-10 08:22:18] MetaTraits to traits...
[INFO] [2021-06-10 08:22:18] MetaTraits (simple, measurement row refers to parent) to traits...
[INFO] [2021-06-10 08:22:18] Assocs to occurrences...
[INFO] [2021-06-10 08:22:18] Assocs to nodes...
[INFO] [2021-06-10 08:22:18] Assoc to sex term...
[INFO] [2021-06-10 08:22:18] Assoc to lifestage term...
[INFO] [2021-06-10 08:22:18] MetaAssoc to assocs...
[STOP] [2021-06-10 08:22:18] resolve_keys
[START] [2021-06-10 08:22:18] hold_for_later_1
[STOP] [2021-06-10 08:22:18] hold_for_later_1
[START] [2021-06-10 08:22:18] hold_for_later_2
[STOP] [2021-06-10 08:22:18] hold_for_later_2
[START] [2021-06-10 08:22:18] resolve_missing_parents
[STOP] [2021-06-10 08:22:18] resolve_missing_parents
[START] [2021-06-10 08:22:18] rebuild_nodes
[START] [2021-06-10 08:22:18] Flattener#flatten
[START] [2021-06-10 08:22:18] Flattener#study_resource
[START] [2021-06-10 08:22:18] Flattener#build_ancestry
[STOP] [2021-06-10 08:22:18] Flattener#build_ancestry
[INFO] [2021-06-10 08:22:18] 71 ancestry keys
[START] [2021-06-10 08:22:18] build_node_ancestors
[INFO] [2021-06-10 08:22:18] old ancestors deleted.
[STOP] [2021-06-10 08:22:18] build_node_ancestors
[START] [2021-06-10 08:22:18] Flattener#propagate_ancestor_ids
[STOP] [2021-06-10 08:22:18] Flattener#propagate_ancestor_ids
[STOP] [2021-06-10 08:22:18] Flattener#flatten
[STOP] [2021-06-10 08:22:18] rebuild_nodes
[START] [2021-06-10 08:22:18] resolve_missing_media_owners
[STOP] [2021-06-10 08:22:18] resolve_missing_media_owners
[START] [2021-06-10 08:22:18] sanitize_media_verbatims
[STOP] [2021-06-10 08:22:18] sanitize_media_verbatims
[START] [2021-06-10 08:22:18] queue_downloads
[STOP] [2021-06-10 08:22:18] queue_downloads
[START] [2021-06-10 08:22:18] parse_names
[WARN] [2021-06-10 08:22:18] I see 71 names which still need to be parsed.
[STOP] [2021-06-10 08:22:19] parse_names
[START] [2021-06-10 08:22:19] denormalize_canonical_names_to_nodes
[STOP] [2021-06-10 08:22:19] denormalize_canonical_names_to_nodes
[START] [2021-06-10 08:22:19] match_nodes
[START] [2021-06-10 08:22:19] map_all_nodes_to_pages
[STOP] [2021-06-10 08:22:25] map_all_nodes_to_pages
[INFO] [2021-06-10 08:22:25] Unmatched nodes (2 of 71): Canonical: Maxillopoda; Node#95890021; ResourceID: Maxillopoda; Canonical: Paracalanus parvus; Node#95890073; ResourceID: Paracalanus parvus
[START] [2021-06-10 08:22:25] update_nodes
[STOP] [2021-06-10 08:22:25] update_nodes
[STOP] [2021-06-10 08:22:25] match_nodes
[START] [2021-06-10 08:22:25] reindex_search
[STOP] [2021-06-10 08:22:26] reindex_search
[START] [2021-06-10 08:22:26] normalize_units
[STOP] [2021-06-10 08:22:26] normalize_units
[START] [2021-06-10 08:22:26] calculate_statistics
[2021-06-10 08:22:26] (NEAR) DUPLICATE TRAITS FOUND! There are only 219 (of 221 total) unique traits.
[2021-06-10 08:22:26] (Near) duplicate trait pairs (up to 100):
[2021-06-10 08:22:26] (resource_pk: 26, id: 219349955), (resource_pk: 6, id: 219349992)
[2021-06-10 08:22:26] (resource_pk: 39, id: 219349969), (resource_pk: 40, id: 219349971)
[STOP] [2021-06-10 08:22:26] calculate_statistics
[START] [2021-06-10 08:22:26] complete_harvest_instance
[START] [2021-06-10 08:22:26] overall_tsv_creation
[INFO] [2021-06-10 08:22:26] Processing group of 71 in 1 batches of 10000
[INFO] [2021-06-10 08:24:30] 221 Traits (unfiltered)...
[INFO] [2021-06-10 08:25:06] 221 Traits (filtered)...
[INFO] [2021-06-10 08:25:06] 0 Associations (filtered)...
[INFO] [2021-06-10 08:25:06] 38 metadata added.
[INFO] [2021-06-10 08:25:06] 0 metadata added.
[INFO] [2021-06-10 08:25:34] Average Time: 84.01
[INFO] [2021-06-10 08:25:34] Total Time: 3m8s
[STOP] [2021-06-10 08:25:34] overall_tsv_creation
[INFO] [2021-06-10 08:25:34] Done. Check your files:
[INFO] [2021-06-10 08:25:35] (71 lines) /app/public/data/copepod_sizes_dw/publish_nodes.tsv
[INFO] [2021-06-10 08:25:35] (207 lines) /app/public/data/copepod_sizes_dw/publish_node_ancestors.tsv
[INFO] [2021-06-10 08:25:36] (71 lines) /app/public/data/copepod_sizes_dw/publish_scientific_names.tsv
[INFO] [2021-06-10 08:25:36] (222 lines) /app/public/data/copepod_sizes_dw/publish_traits.tsv
[INFO] [2021-06-10 08:25:37] (39 lines) /app/public/data/copepod_sizes_dw/publish_metadata.tsv
[STOP] [2021-06-10 08:25:37] complete_harvest_instance
[START] [2021-06-10 08:25:37] completed
[STOP] [2021-06-10 08:25:37] completed
[STOP] [2021-06-10 08:25:37] logged process, took 231.32
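
The [START]/[STOP] pairs in the log above can be used to see where the harvest time went. What follows is a minimal sketch, not part of the harvester itself: the function name step_durations and the regular expression are assumptions made for illustration, and the sample lines are copied from the log. It pairs each STOP with the most recent START of the same step name and ignores every other line type.

```python
import re
from datetime import datetime

# Matches lines such as "[START] [2021-06-10 08:22:10] resolve_keys".
# The pattern is an assumption based on the log layout shown above.
LINE_RE = re.compile(
    r"^\[(START|STOP)\] \[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\] (.+)$"
)


def step_durations(lines):
    """Return (step, seconds) for every START/STOP pair, in STOP order."""
    open_steps = {}  # step name -> timestamp of its most recent START
    durations = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # INFO, CMD, WARN and untagged lines are ignored
        kind, stamp, name = match.groups()
        when = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S")
        if kind == "START":
            open_steps[name] = when
        elif name in open_steps:
            started = open_steps.pop(name)
            durations.append((name, (when - started).total_seconds()))
        # A STOP with no matching START (for example the first
        # create_harvest_instance line) is skipped.
    return durations


sample = [
    "[START] [2021-06-10 08:22:10] resolve_keys",
    "[INFO] [2021-06-10 08:22:17] Occurrences to nodes (through scientific_names)...",
    "[STOP] [2021-06-10 08:22:18] resolve_keys",
    "[START] [2021-06-10 08:22:26] overall_tsv_creation",
    "[STOP] [2021-06-10 08:25:34] overall_tsv_creation",
]

for name, seconds in step_durations(sample):
    print(f"{name}: {seconds:.0f}s")
```

Run against the full log, a pass like this would show that overall_tsv_creation accounts for most of the elapsed time, which agrees with the "Total Time: 3m8s" line reported for that step.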

Latest Process