Harvest for Lovegrove 2014
Created: 13 Oct 12:48
Stage: completed
Fetched: 13 Oct 12:48
Validated: 13 Oct 12:48
Deltas Created: 13 Oct 12:49
Units Normalized: 13 Oct 12:49
Ancestry Built: 13 Oct 12:49
Nodes Matched: 13 Oct 12:49
Names Parsed: 13 Oct 12:49
New Models Stored: 13 Oct 12:49
Indexed: 13 Oct 12:49
Completed: 13 Oct 12:49
Time to Harvest: less than a minute
Harvesting Log (206 lines)
[INFO] [2023-10-13 12:48:59] Created harvest instance #4444
[STOP] [2023-10-13 12:48:59] create_harvest_instance
[START] [2023-10-13 12:48:59] fetch_files
[STOP] [2023-10-13 12:48:59] fetch_files
[START] [2023-10-13 12:48:59] validate_each_file
[INFO] [2023-10-13 12:48:59] Looping over 8 formats...
[INFO] [2023-10-13 12:48:59] ...agents (/app/public/data/lovegrove/agents.txt)
[INFO] [2023-10-13 12:48:59] Valid: /app/public/data/lovegrove/converted_csv/lovegrove_agents_30767.csv (0 lines)
[INFO] [2023-10-13 12:48:59] ...refs (/app/public/data/lovegrove/references.txt)
[INFO] [2023-10-13 12:48:59] Valid: /app/public/data/lovegrove/converted_csv/lovegrove_refs_30766.csv (6 lines)
[INFO] [2023-10-13 12:48:59] ...nodes (/app/public/data/lovegrove/taxa.txt)
[INFO] [2023-10-13 12:48:59] Valid: /app/public/data/lovegrove/converted_csv/lovegrove_nodes_30764.csv (83 lines)
[INFO] [2023-10-13 12:48:59] ...media (/app/public/data/lovegrove/media.txt)
[INFO] [2023-10-13 12:48:59] Valid: /app/public/data/lovegrove/converted_csv/lovegrove_media_30763.csv (0 lines)
[INFO] [2023-10-13 12:48:59] ...vernaculars (/app/public/data/lovegrove/common names.txt)
[INFO] [2023-10-13 12:48:59] Valid: /app/public/data/lovegrove/converted_csv/lovegrove_vernaculars_30765.csv (0 lines)
[INFO] [2023-10-13 12:48:59] ...occurrences (/app/public/data/lovegrove/occurrences.txt)
[INFO] [2023-10-13 12:48:59] Valid: /app/public/data/lovegrove/converted_csv/lovegrove_occurrences_30768.csv (83 lines)
[INFO] [2023-10-13 12:48:59] ...assocs (/app/public/data/lovegrove/associations.txt)
[INFO] [2023-10-13 12:48:59] Valid: /app/public/data/lovegrove/converted_csv/lovegrove_assocs_30770.csv (0 lines)
[INFO] [2023-10-13 12:48:59] ...measurements (/app/public/data/lovegrove/measurementorfact.txt)
[INFO] [2023-10-13 12:48:59] Valid: /app/public/data/lovegrove/converted_csv/lovegrove_measurements_30769.csv (810 lines)
[STOP] [2023-10-13 12:48:59] validate_each_file
[START] [2023-10-13 12:48:59] convert_to_csv
[INFO] [2023-10-13 12:48:59] Looping over 8 formats...
[INFO] [2023-10-13 12:48:59] ...agents (/app/public/data/lovegrove/agents.txt)
[CMD] [2023-10-13 12:48:59] /usr/bin/sort /app/public/data/lovegrove/converted_csv/lovegrove_agents_30767.csv > /app/public/data/lovegrove/converted_csv/lovegrove_agents_30767.csv_sorted
[INFO] [2023-10-13 12:48:59] Converted: /app/public/data/lovegrove/converted_csv/lovegrove_agents_30767.csv (0 lines)
[INFO] [2023-10-13 12:48:59] ...refs (/app/public/data/lovegrove/references.txt)
[CMD] [2023-10-13 12:48:59] /usr/bin/sort /app/public/data/lovegrove/converted_csv/lovegrove_refs_30766.csv > /app/public/data/lovegrove/converted_csv/lovegrove_refs_30766.csv_sorted
[INFO] [2023-10-13 12:48:59] Converted: /app/public/data/lovegrove/converted_csv/lovegrove_refs_30766.csv (6 lines)
[INFO] [2023-10-13 12:48:59] ...nodes (/app/public/data/lovegrove/taxa.txt)
[CMD] [2023-10-13 12:48:59] /usr/bin/sort /app/public/data/lovegrove/converted_csv/lovegrove_nodes_30764.csv > /app/public/data/lovegrove/converted_csv/lovegrove_nodes_30764.csv_sorted
[INFO] [2023-10-13 12:48:59] Converted: /app/public/data/lovegrove/converted_csv/lovegrove_nodes_30764.csv (83 lines)
[INFO] [2023-10-13 12:48:59] ...media (/app/public/data/lovegrove/media.txt)
[CMD] [2023-10-13 12:48:59] /usr/bin/sort /app/public/data/lovegrove/converted_csv/lovegrove_media_30763.csv > /app/public/data/lovegrove/converted_csv/lovegrove_media_30763.csv_sorted
[INFO] [2023-10-13 12:48:59] Converted: /app/public/data/lovegrove/converted_csv/lovegrove_media_30763.csv (0 lines)
[INFO] [2023-10-13 12:48:59] ...vernaculars (/app/public/data/lovegrove/common names.txt)
[CMD] [2023-10-13 12:48:59] /usr/bin/sort /app/public/data/lovegrove/converted_csv/lovegrove_vernaculars_30765.csv > /app/public/data/lovegrove/converted_csv/lovegrove_vernaculars_30765.csv_sorted
[INFO] [2023-10-13 12:48:59] Converted: /app/public/data/lovegrove/converted_csv/lovegrove_vernaculars_30765.csv (0 lines)
[INFO] [2023-10-13 12:48:59] ...occurrences (/app/public/data/lovegrove/occurrences.txt)
[CMD] [2023-10-13 12:48:59] /usr/bin/sort /app/public/data/lovegrove/converted_csv/lovegrove_occurrences_30768.csv > /app/public/data/lovegrove/converted_csv/lovegrove_occurrences_30768.csv_sorted
[INFO] [2023-10-13 12:48:59] Converted: /app/public/data/lovegrove/converted_csv/lovegrove_occurrences_30768.csv (83 lines)
[INFO] [2023-10-13 12:48:59] ...assocs (/app/public/data/lovegrove/associations.txt)
[CMD] [2023-10-13 12:48:59] /usr/bin/sort /app/public/data/lovegrove/converted_csv/lovegrove_assocs_30770.csv > /app/public/data/lovegrove/converted_csv/lovegrove_assocs_30770.csv_sorted
[INFO] [2023-10-13 12:49:00] Converted: /app/public/data/lovegrove/converted_csv/lovegrove_assocs_30770.csv (0 lines)
[INFO] [2023-10-13 12:49:00] ...measurements (/app/public/data/lovegrove/measurementorfact.txt)
[CMD] [2023-10-13 12:49:00] /usr/bin/sort /app/public/data/lovegrove/converted_csv/lovegrove_measurements_30769.csv > /app/public/data/lovegrove/converted_csv/lovegrove_measurements_30769.csv_sorted
[INFO] [2023-10-13 12:49:00] Converted: /app/public/data/lovegrove/converted_csv/lovegrove_measurements_30769.csv (810 lines)
[STOP] [2023-10-13 12:49:00] convert_to_csv
[START] [2023-10-13 12:49:00] calculate_delta
[INFO] [2023-10-13 12:49:00] Looping over 8 formats...
[INFO] [2023-10-13 12:49:00] ...agents (/app/public/data/lovegrove/agents.txt)
[CMD] [2023-10-13 12:49:00] echo "0a" > /app/public/data/lovegrove/diff/lovegrove_agents_30767.diff
[CMD] [2023-10-13 12:49:00] tail -n +1 /app/public/data/lovegrove/converted_csv/lovegrove_agents_30767.csv >> /app/public/data/lovegrove/diff/lovegrove_agents_30767.diff
[CMD] [2023-10-13 12:49:00] echo "." >> /app/public/data/lovegrove/diff/lovegrove_agents_30767.diff
[INFO] [2023-10-13 12:49:00] Created diff: /app/public/data/lovegrove/diff/lovegrove_agents_30767.diff (2 lines)
[INFO] [2023-10-13 12:49:00] ...refs (/app/public/data/lovegrove/references.txt)
[CMD] [2023-10-13 12:49:00] echo "0a" > /app/public/data/lovegrove/diff/lovegrove_refs_30766.diff
[CMD] [2023-10-13 12:49:00] tail -n +1 /app/public/data/lovegrove/converted_csv/lovegrove_refs_30766.csv >> /app/public/data/lovegrove/diff/lovegrove_refs_30766.diff
[CMD] [2023-10-13 12:49:00] echo "." >> /app/public/data/lovegrove/diff/lovegrove_refs_30766.diff
[INFO] [2023-10-13 12:49:00] Created diff: /app/public/data/lovegrove/diff/lovegrove_refs_30766.diff (8 lines)
[INFO] [2023-10-13 12:49:00] ...nodes (/app/public/data/lovegrove/taxa.txt)
[CMD] [2023-10-13 12:49:00] echo "0a" > /app/public/data/lovegrove/diff/lovegrove_nodes_30764.diff
[CMD] [2023-10-13 12:49:00] tail -n +1 /app/public/data/lovegrove/converted_csv/lovegrove_nodes_30764.csv >> /app/public/data/lovegrove/diff/lovegrove_nodes_30764.diff
[CMD] [2023-10-13 12:49:00] echo "." >> /app/public/data/lovegrove/diff/lovegrove_nodes_30764.diff
[INFO] [2023-10-13 12:49:00] Created diff: /app/public/data/lovegrove/diff/lovegrove_nodes_30764.diff (85 lines)
[INFO] [2023-10-13 12:49:00] ...media (/app/public/data/lovegrove/media.txt)
[CMD] [2023-10-13 12:49:00] echo "0a" > /app/public/data/lovegrove/diff/lovegrove_media_30763.diff
[CMD] [2023-10-13 12:49:01] tail -n +1 /app/public/data/lovegrove/converted_csv/lovegrove_media_30763.csv >> /app/public/data/lovegrove/diff/lovegrove_media_30763.diff
[CMD] [2023-10-13 12:49:01] echo "." >> /app/public/data/lovegrove/diff/lovegrove_media_30763.diff
[INFO] [2023-10-13 12:49:01] Created diff: /app/public/data/lovegrove/diff/lovegrove_media_30763.diff (2 lines)
[INFO] [2023-10-13 12:49:01] ...vernaculars (/app/public/data/lovegrove/common names.txt)
[CMD] [2023-10-13 12:49:01] echo "0a" > /app/public/data/lovegrove/diff/lovegrove_vernaculars_30765.diff
[CMD] [2023-10-13 12:49:01] tail -n +1 /app/public/data/lovegrove/converted_csv/lovegrove_vernaculars_30765.csv >> /app/public/data/lovegrove/diff/lovegrove_vernaculars_30765.diff
[CMD] [2023-10-13 12:49:01] echo "." >> /app/public/data/lovegrove/diff/lovegrove_vernaculars_30765.diff
[INFO] [2023-10-13 12:49:01] Created diff: /app/public/data/lovegrove/diff/lovegrove_vernaculars_30765.diff (2 lines)
[INFO] [2023-10-13 12:49:01] ...occurrences (/app/public/data/lovegrove/occurrences.txt)
[CMD] [2023-10-13 12:49:01] echo "0a" > /app/public/data/lovegrove/diff/lovegrove_occurrences_30768.diff
[CMD] [2023-10-13 12:49:01] tail -n +1 /app/public/data/lovegrove/converted_csv/lovegrove_occurrences_30768.csv >> /app/public/data/lovegrove/diff/lovegrove_occurrences_30768.diff
[CMD] [2023-10-13 12:49:01] echo "." >> /app/public/data/lovegrove/diff/lovegrove_occurrences_30768.diff
[INFO] [2023-10-13 12:49:01] Created diff: /app/public/data/lovegrove/diff/lovegrove_occurrences_30768.diff (85 lines)
[INFO] [2023-10-13 12:49:01] ...assocs (/app/public/data/lovegrove/associations.txt)
[CMD] [2023-10-13 12:49:01] echo "0a" > /app/public/data/lovegrove/diff/lovegrove_assocs_30770.diff
[CMD] [2023-10-13 12:49:01] tail -n +1 /app/public/data/lovegrove/converted_csv/lovegrove_assocs_30770.csv >> /app/public/data/lovegrove/diff/lovegrove_assocs_30770.diff
[CMD] [2023-10-13 12:49:01] echo "." >> /app/public/data/lovegrove/diff/lovegrove_assocs_30770.diff
[INFO] [2023-10-13 12:49:01] Created diff: /app/public/data/lovegrove/diff/lovegrove_assocs_30770.diff (2 lines)
[INFO] [2023-10-13 12:49:01] ...measurements (/app/public/data/lovegrove/measurementorfact.txt)
[CMD] [2023-10-13 12:49:01] echo "0a" > /app/public/data/lovegrove/diff/lovegrove_measurements_30769.diff
[CMD] [2023-10-13 12:49:01] tail -n +1 /app/public/data/lovegrove/converted_csv/lovegrove_measurements_30769.csv >> /app/public/data/lovegrove/diff/lovegrove_measurements_30769.diff
[CMD] [2023-10-13 12:49:01] echo "." >> /app/public/data/lovegrove/diff/lovegrove_measurements_30769.diff
[INFO] [2023-10-13 12:49:02] Created diff: /app/public/data/lovegrove/diff/lovegrove_measurements_30769.diff (812 lines)
[STOP] [2023-10-13 12:49:02] calculate_delta
[START] [2023-10-13 12:49:02] parse_diff_and_store
[INFO] [2023-10-13 12:49:02] Handling diff: /app/public/data/lovegrove/diff/lovegrove_agents_30767.diff (2 lines)
[INFO] [2023-10-13 12:49:02] Loading agents diff file into memory (2 lines)...
[INFO] [2023-10-13 12:49:02] Handling diff: /app/public/data/lovegrove/diff/lovegrove_refs_30766.diff (8 lines)
[INFO] [2023-10-13 12:49:02] Loading refs diff file into memory (8 lines)...
[INFO] [2023-10-13 12:49:02] Storing 6 References (6/6/8)
[INFO] [2023-10-13 12:49:02] Handling diff: /app/public/data/lovegrove/diff/lovegrove_nodes_30764.diff (85 lines)
[INFO] [2023-10-13 12:49:02] Loading nodes diff file into memory (85 lines)...
[INFO] [2023-10-13 12:49:02] Storing 85 ScientificNames (170/83/85)
[INFO] [2023-10-13 12:49:02] Storing 85 Nodes (170/83/85)
[INFO] [2023-10-13 12:49:02] Handling diff: /app/public/data/lovegrove/diff/lovegrove_media_30763.diff (2 lines)
[INFO] [2023-10-13 12:49:02] Loading media diff file into memory (2 lines)...
[INFO] [2023-10-13 12:49:02] Handling diff: /app/public/data/lovegrove/diff/lovegrove_vernaculars_30765.diff (2 lines)
[INFO] [2023-10-13 12:49:02] Loading vernaculars diff file into memory (2 lines)...
[INFO] [2023-10-13 12:49:02] Handling diff: /app/public/data/lovegrove/diff/lovegrove_occurrences_30768.diff (85 lines)
[INFO] [2023-10-13 12:49:02] Loading occurrences diff file into memory (85 lines)...
[INFO] [2023-10-13 12:49:02] Storing 83 Occurrences (83/83/85)
[INFO] [2023-10-13 12:49:02] Handling diff: /app/public/data/lovegrove/diff/lovegrove_assocs_30770.diff (2 lines)
[INFO] [2023-10-13 12:49:02] Loading assocs diff file into memory (2 lines)...
[INFO] [2023-10-13 12:49:02] Handling diff: /app/public/data/lovegrove/diff/lovegrove_measurements_30769.diff (812 lines)
[INFO] [2023-10-13 12:49:02] Loading measurements diff file into memory (812 lines)...
[INFO] [2023-10-13 12:49:03] Storing 644 Traits (1450/810/812)
[INFO] [2023-10-13 12:49:03] Storing 324 TraitsReferences (1450/810/812)
[INFO] [2023-10-13 12:49:03] Storing 316 MetaTraits (1450/810/812)
[INFO] [2023-10-13 12:49:03] Storing 166 OccurrenceMetadata (1450/810/812)
[STOP] [2023-10-13 12:49:03] parse_diff_and_store
[START] [2023-10-13 12:49:03] resolve_keys
[2023-10-13 12:49:03] Resolving downloaded urls (this is not actually downloading them yet)
[INFO] [2023-10-13 12:49:11] Occurrences to nodes (through scientific_names)...
[INFO] [2023-10-13 12:49:11] traits to occurrences...
[INFO] [2023-10-13 12:49:11] traits to nodes (through occurrences)...
[INFO] [2023-10-13 12:49:11] Traits to sex term...
[INFO] [2023-10-13 12:49:11] Traits to lifestage term...
[INFO] [2023-10-13 12:49:11] MetaTraits to traits...
[INFO] [2023-10-13 12:49:11] MetaTraits (simple, measurement row refers to parent) to traits...
[INFO] [2023-10-13 12:49:11] Assocs to occurrences...
[INFO] [2023-10-13 12:49:11] Assocs to nodes...
[INFO] [2023-10-13 12:49:11] Assoc to sex term...
[INFO] [2023-10-13 12:49:11] Assoc to lifestage term...
[INFO] [2023-10-13 12:49:11] MetaAssoc to assocs...
[STOP] [2023-10-13 12:49:11] resolve_keys
[START] [2023-10-13 12:49:11] hold_for_later_1
[STOP] [2023-10-13 12:49:11] hold_for_later_1
[START] [2023-10-13 12:49:11] hold_for_later_2
[STOP] [2023-10-13 12:49:11] hold_for_later_2
[START] [2023-10-13 12:49:11] resolve_missing_parents
[STOP] [2023-10-13 12:49:11] resolve_missing_parents
[START] [2023-10-13 12:49:11] rebuild_nodes
[START] [2023-10-13 12:49:11] Flattener#flatten
[START] [2023-10-13 12:49:11] Flattener#study_resource
[START] [2023-10-13 12:49:11] Flattener#build_ancestry
[STOP] [2023-10-13 12:49:11] Flattener#build_ancestry
[INFO] [2023-10-13 12:49:11] 85 ancestry keys
[START] [2023-10-13 12:49:11] build_node_ancestors
[INFO] [2023-10-13 12:49:11] old ancestors deleted.
[STOP] [2023-10-13 12:49:11] build_node_ancestors
[START] [2023-10-13 12:49:11] Flattener#propagate_ancestor_ids
[STOP] [2023-10-13 12:49:11] Flattener#propagate_ancestor_ids
[STOP] [2023-10-13 12:49:11] Flattener#flatten
[STOP] [2023-10-13 12:49:11] rebuild_nodes
[START] [2023-10-13 12:49:11] resolve_missing_media_owners
[STOP] [2023-10-13 12:49:11] resolve_missing_media_owners
[START] [2023-10-13 12:49:11] sanitize_media_verbatims
[STOP] [2023-10-13 12:49:11] sanitize_media_verbatims
[START] [2023-10-13 12:49:11] queue_downloads
[STOP] [2023-10-13 12:49:11] queue_downloads
[START] [2023-10-13 12:49:11] parse_names
[WARN] [2023-10-13 12:49:11] I see 85 names which still need to be parsed.
[WARN] [2023-10-13 12:49:11] Names to parse: 85 formatted: 85 learned: 85 parsed: 85
[STOP] [2023-10-13 12:49:12] parse_names
[START] [2023-10-13 12:49:12] denormalize_canonical_names_to_nodes
[STOP] [2023-10-13 12:49:12] denormalize_canonical_names_to_nodes
[START] [2023-10-13 12:49:12] match_nodes
[START] [2023-10-13 12:49:12] map_all_nodes_to_pages
[STOP] [2023-10-13 12:49:13] map_all_nodes_to_pages
[INFO] [2023-10-13 12:49:13] Unmatched nodes (2 of 85): Canonical: Crociduridae; Node#137164393; ResourceID: Crociduridae; Canonical: Scalopidae; Node#137164445; ResourceID: Scalopidae
[START] [2023-10-13 12:49:13] update_nodes
[STOP] [2023-10-13 12:49:13] update_nodes
[STOP] [2023-10-13 12:49:13] match_nodes
[START] [2023-10-13 12:49:13] reindex_search
[STOP] [2023-10-13 12:49:13] reindex_search
[START] [2023-10-13 12:49:13] normalize_units
[STOP] [2023-10-13 12:49:13] normalize_units
[START] [2023-10-13 12:49:13] calculate_statistics
[INFO] [2023-10-13 12:49:14] Duplicate page_id count: 2
[STOP] [2023-10-13 12:49:14] calculate_statistics
[START] [2023-10-13 12:49:14] complete_harvest_instance
[START] [2023-10-13 12:49:14] overall_tsv_creation
[INFO] [2023-10-13 12:49:14] Exporting 85 nodes as TSV in batches of 10000...
[INFO] [2023-10-13 12:49:14] Processing group of 85 in 1 batches of 10000
[INFO] [2023-10-13 12:49:14] 158 Traits (unfiltered) and 0 associations...
[INFO] [2023-10-13 12:49:14] Building Traits map for 85 nodes (this can take a while)...
[INFO] [2023-10-13 12:49:14] Mapped 158 traits (316 meta) for 85 nodes.
[INFO] [2023-10-13 12:49:14] Building Associations map (this can take a while)...
[INFO] [2023-10-13 12:49:14] Done. 0 assocs mapped (0 meta).
[INFO] [2023-10-13 12:49:14] Adding 158 traits...
[INFO] [2023-10-13 12:49:15] 794 metadata added.
[INFO] [2023-10-13 12:49:15] Adding 0 assocs...
[INFO] [2023-10-13 12:49:15] 0 metadata added.
[INFO] [2023-10-13 12:49:59] Processed 85/85 nodes
[INFO] [2023-10-13 12:49:59] Average Time: 44.38
[INFO] [2023-10-13 12:49:59] Total Time: 45s
[STOP] [2023-10-13 12:49:59] overall_tsv_creation
[INFO] [2023-10-13 12:49:59] Done. Check your files:
[INFO] [2023-10-13 12:49:59] (85 lines) /app/public/data/lovegrove/publish_nodes.tsv
[INFO] [2023-10-13 12:49:59] (5 lines) /app/public/data/lovegrove/publish_node_ancestors.tsv
[INFO] [2023-10-13 12:49:59] (85 lines) /app/public/data/lovegrove/publish_scientific_names.tsv
[INFO] [2023-10-13 12:49:59] (159 lines) /app/public/data/lovegrove/publish_traits.tsv
[INFO] [2023-10-13 12:49:59] (795 lines) /app/public/data/lovegrove/publish_metadata.tsv
[STOP] [2023-10-13 12:49:59] complete_harvest_instance
[START] [2023-10-13 12:49:59] completed
[STOP] [2023-10-13 12:49:59] completed
[STOP] [2023-10-13 12:49:59] logged process, took 60.37
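
The [START]/[STOP] pairs in the log above can be used to see where the time went. The following is a minimal sketch, not part of the harvester itself: the function name phase_durations and the regular expression are assumptions made for illustration, and the line format is inferred only from the entries shown in this log. A [STOP] without a matching [START] (for example the final "logged process" line) is ignored.

```python
import re
from datetime import datetime

# Matches lines such as "[START] [2023-10-13 12:48:59] fetch_files".
# The format is inferred from this log, not from a documented specification.
LINE_RE = re.compile(
    r"^\[(START|STOP)\] \[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\] (.+)$"
)

def phase_durations(lines):
    """Return {phase_name: seconds} for every START that has a matching STOP."""
    started = {}    # phase name -> timestamp of its START line
    durations = {}
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # INFO, WARN, CMD and untagged lines are skipped
        kind, stamp, name = match.groups()
        when = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S")
        if kind == "START":
            started[name] = when
        elif name in started:
            # Timestamps have one-second resolution, so results are whole seconds.
            durations[name] = (when - started.pop(name)).total_seconds()
    return durations

# A few lines copied from the log above.
sample = [
    "[START] [2023-10-13 12:49:03] resolve_keys",
    "[INFO] [2023-10-13 12:49:11] Occurrences to nodes (through scientific_names)...",
    "[STOP] [2023-10-13 12:49:11] resolve_keys",
    "[START] [2023-10-13 12:49:14] overall_tsv_creation",
    "[STOP] [2023-10-13 12:49:59] overall_tsv_creation",
    "[STOP] [2023-10-13 12:49:59] logged process, took 60.37",
]
durations = phase_durations(sample)
print(durations)
```

For the sample lines this reports 8 seconds for resolve_keys and 45 seconds for overall_tsv_creation, which agrees with the "Total Time: 45s" entry in the log. Because phases are keyed by name, nested phases such as Flattener#flatten inside rebuild_nodes are handled independently of one another.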