Harvest for McFadden et al 2010
Created: 03 Aug 17:00

Stage: completed
Fetched: 03 Aug 17:00
Validated: 03 Aug 17:00
Deltas Created: 03 Aug 17:00
Units Normalized: 03 Aug 17:00
Ancestry Built: 03 Aug 17:00
Nodes Matched: 03 Aug 17:00
Names Parsed: 03 Aug 17:00
New Models Stored: 03 Aug 17:00
Indexed: 03 Aug 17:00
Completed: 03 Aug 17:02
Time to Harvest: less than a minute

Harvesting Log

(143 lines)
[INFO] [2022-08-03 17:00:03] Created harvest instance #4185
[STOP] [2022-08-03 17:00:03] create_harvest_instance
[START] [2022-08-03 17:00:03] fetch_files
[STOP] [2022-08-03 17:00:03] fetch_files
[START] [2022-08-03 17:00:03] validate_each_file
[INFO] [2022-08-03 17:00:03] Looping over 3 formats...
[INFO] [2022-08-03 17:00:03] ...nodes (/app/public/data/mfeam/taxa.txt)
[INFO] [2022-08-03 17:00:03] Valid: /app/public/data/mfeam/converted_csv/mfeam_nodes_29621.csv (3 lines)
[INFO] [2022-08-03 17:00:03] ...occurrences (/app/public/data/mfeam/occurrences.txt)
[INFO] [2022-08-03 17:00:03] Valid: /app/public/data/mfeam/converted_csv/mfeam_occurrences_29622.csv (3 lines)
[INFO] [2022-08-03 17:00:03] ...measurements (/app/public/data/mfeam/measurementOrFact.txt)
[INFO] [2022-08-03 17:00:03] Valid: /app/public/data/mfeam/converted_csv/mfeam_measurements_29623.csv (7 lines)
[STOP] [2022-08-03 17:00:03] validate_each_file
[START] [2022-08-03 17:00:03] convert_to_csv
[INFO] [2022-08-03 17:00:03] Looping over 3 formats...
[INFO] [2022-08-03 17:00:03] ...nodes (/app/public/data/mfeam/taxa.txt)
[CMD] [2022-08-03 17:00:03] /usr/bin/sort /app/public/data/mfeam/converted_csv/mfeam_nodes_29621.csv > /app/public/data/mfeam/converted_csv/mfeam_nodes_29621.csv_sorted
[INFO] [2022-08-03 17:00:03] Converted: /app/public/data/mfeam/converted_csv/mfeam_nodes_29621.csv (3 lines)
[INFO] [2022-08-03 17:00:03] ...occurrences (/app/public/data/mfeam/occurrences.txt)
[CMD] [2022-08-03 17:00:03] /usr/bin/sort /app/public/data/mfeam/converted_csv/mfeam_occurrences_29622.csv > /app/public/data/mfeam/converted_csv/mfeam_occurrences_29622.csv_sorted
[INFO] [2022-08-03 17:00:03] Converted: /app/public/data/mfeam/converted_csv/mfeam_occurrences_29622.csv (3 lines)
[INFO] [2022-08-03 17:00:03] ...measurements (/app/public/data/mfeam/measurementOrFact.txt)
[CMD] [2022-08-03 17:00:03] /usr/bin/sort /app/public/data/mfeam/converted_csv/mfeam_measurements_29623.csv > /app/public/data/mfeam/converted_csv/mfeam_measurements_29623.csv_sorted
[INFO] [2022-08-03 17:00:03] Converted: /app/public/data/mfeam/converted_csv/mfeam_measurements_29623.csv (7 lines)
[STOP] [2022-08-03 17:00:03] convert_to_csv
[START] [2022-08-03 17:00:03] calculate_delta
[INFO] [2022-08-03 17:00:03] Looping over 3 formats...
[INFO] [2022-08-03 17:00:03] ...nodes (/app/public/data/mfeam/taxa.txt)
[CMD] [2022-08-03 17:00:03] echo "0a" > /app/public/data/mfeam/diff/mfeam_nodes_29621.diff
[CMD] [2022-08-03 17:00:03] tail -n +1 /app/public/data/mfeam/converted_csv/mfeam_nodes_29621.csv >> /app/public/data/mfeam/diff/mfeam_nodes_29621.diff
[CMD] [2022-08-03 17:00:03] echo "." >> /app/public/data/mfeam/diff/mfeam_nodes_29621.diff
[INFO] [2022-08-03 17:00:03] Created diff: /app/public/data/mfeam/diff/mfeam_nodes_29621.diff (5 lines)
[INFO] [2022-08-03 17:00:03] ...occurrences (/app/public/data/mfeam/occurrences.txt)
[CMD] [2022-08-03 17:00:03] echo "0a" > /app/public/data/mfeam/diff/mfeam_occurrences_29622.diff
[CMD] [2022-08-03 17:00:03] tail -n +1 /app/public/data/mfeam/converted_csv/mfeam_occurrences_29622.csv >> /app/public/data/mfeam/diff/mfeam_occurrences_29622.diff
[CMD] [2022-08-03 17:00:03] echo "." >> /app/public/data/mfeam/diff/mfeam_occurrences_29622.diff
[INFO] [2022-08-03 17:00:03] Created diff: /app/public/data/mfeam/diff/mfeam_occurrences_29622.diff (5 lines)
[INFO] [2022-08-03 17:00:03] ...measurements (/app/public/data/mfeam/measurementOrFact.txt)
[CMD] [2022-08-03 17:00:03] echo "0a" > /app/public/data/mfeam/diff/mfeam_measurements_29623.diff
[CMD] [2022-08-03 17:00:03] tail -n +1 /app/public/data/mfeam/converted_csv/mfeam_measurements_29623.csv >> /app/public/data/mfeam/diff/mfeam_measurements_29623.diff
[CMD] [2022-08-03 17:00:03] echo "." >> /app/public/data/mfeam/diff/mfeam_measurements_29623.diff
[INFO] [2022-08-03 17:00:03] Created diff: /app/public/data/mfeam/diff/mfeam_measurements_29623.diff (9 lines)
[STOP] [2022-08-03 17:00:03] calculate_delta
[START] [2022-08-03 17:00:03] parse_diff_and_store
[INFO] [2022-08-03 17:00:03] Handling diff: /app/public/data/mfeam/diff/mfeam_nodes_29621.diff (5 lines)
[INFO] [2022-08-03 17:00:03] Loading nodes diff file into memory (5 lines)...
[INFO] [2022-08-03 17:00:03] Storing 3 ScientificNames (6/3/5)
[INFO] [2022-08-03 17:00:03] Storing 3 Nodes (6/3/5)
[INFO] [2022-08-03 17:00:03] Handling diff: /app/public/data/mfeam/diff/mfeam_occurrences_29622.diff (5 lines)
[INFO] [2022-08-03 17:00:03] Loading occurrences diff file into memory (5 lines)...
[INFO] [2022-08-03 17:00:03] Storing 3 Occurrences (4/3/5)
[INFO] [2022-08-03 17:00:03] Storing 1 OccurrenceMetadata (4/3/5)
[INFO] [2022-08-03 17:00:03] Handling diff: /app/public/data/mfeam/diff/mfeam_measurements_29623.diff (9 lines)
[INFO] [2022-08-03 17:00:03] Loading measurements diff file into memory (9 lines)...
[INFO] [2022-08-03 17:00:03] Storing 7 Traits (11/7/9)
[INFO] [2022-08-03 17:00:03] Storing 4 MetaTraits (11/7/9)
[STOP] [2022-08-03 17:00:03] parse_diff_and_store
[START] [2022-08-03 17:00:03] resolve_keys
[2022-08-03 17:00:03] Resolving downloaded urls (this is not actually downloading them yet)
[INFO] [2022-08-03 17:00:10] Occurrences to nodes (through scientific_names)...
[INFO] [2022-08-03 17:00:10] traits to occurrences...
[INFO] [2022-08-03 17:00:10] traits to nodes (through occurrences)...
[INFO] [2022-08-03 17:00:10] Traits to sex term...
[INFO] [2022-08-03 17:00:10] Traits to lifestage term...
[INFO] [2022-08-03 17:00:10] MetaTraits to traits...
[INFO] [2022-08-03 17:00:10] MetaTraits (simple, measurement row refers to parent) to traits...
[INFO] [2022-08-03 17:00:10] Assocs to occurrences...
[INFO] [2022-08-03 17:00:10] Assocs to nodes...
[INFO] [2022-08-03 17:00:10] Assoc to sex term...
[INFO] [2022-08-03 17:00:11] Assoc to lifestage term...
[INFO] [2022-08-03 17:00:11] MetaAssoc to assocs...
[STOP] [2022-08-03 17:00:11] resolve_keys
[START] [2022-08-03 17:00:11] hold_for_later_1
[STOP] [2022-08-03 17:00:11] hold_for_later_1
[START] [2022-08-03 17:00:11] hold_for_later_2
[STOP] [2022-08-03 17:00:11] hold_for_later_2
[START] [2022-08-03 17:00:11] resolve_missing_parents
[STOP] [2022-08-03 17:00:11] resolve_missing_parents
[START] [2022-08-03 17:00:11] rebuild_nodes
[START] [2022-08-03 17:00:11] Flattener#flatten
[START] [2022-08-03 17:00:11] Flattener#study_resource
[START] [2022-08-03 17:00:11] Flattener#build_ancestry
[STOP] [2022-08-03 17:00:11] Flattener#build_ancestry
[INFO] [2022-08-03 17:00:11] 3 ancestry keys
[START] [2022-08-03 17:00:11] build_node_ancestors
[INFO] [2022-08-03 17:00:11] old ancestors deleted.
[STOP] [2022-08-03 17:00:11] build_node_ancestors
[WARN] [2022-08-03 17:00:11] Flattener: nothing to flatten! (Completely flat resource?)
[STOP] [2022-08-03 17:00:11] Flattener#flatten
[STOP] [2022-08-03 17:00:11] rebuild_nodes
[START] [2022-08-03 17:00:11] resolve_missing_media_owners
[STOP] [2022-08-03 17:00:11] resolve_missing_media_owners
[START] [2022-08-03 17:00:11] sanitize_media_verbatims
[STOP] [2022-08-03 17:00:11] sanitize_media_verbatims
[START] [2022-08-03 17:00:11] queue_downloads
[STOP] [2022-08-03 17:00:11] queue_downloads
[START] [2022-08-03 17:00:11] parse_names
[WARN] [2022-08-03 17:00:11] I see 3 names which still need to be parsed.
[WARN] [2022-08-03 17:00:11] Names to parse: 3 formatted: 3 learned: 3 parsed: 3
[STOP] [2022-08-03 17:00:12] parse_names
[START] [2022-08-03 17:00:12] denormalize_canonical_names_to_nodes
[STOP] [2022-08-03 17:00:12] denormalize_canonical_names_to_nodes
[START] [2022-08-03 17:00:12] match_nodes
[START] [2022-08-03 17:00:12] map_all_nodes_to_pages
[STOP] [2022-08-03 17:00:12] map_all_nodes_to_pages
[INFO] [2022-08-03 17:00:12] ZERO unmatched nodes (of 3)! Nicely done.
[START] [2022-08-03 17:00:12] update_nodes
[STOP] [2022-08-03 17:00:12] update_nodes
[STOP] [2022-08-03 17:00:12] match_nodes
[START] [2022-08-03 17:00:12] reindex_search
[STOP] [2022-08-03 17:00:12] reindex_search
[START] [2022-08-03 17:00:12] normalize_units
[STOP] [2022-08-03 17:00:12] normalize_units
[START] [2022-08-03 17:00:12] calculate_statistics
[2022-08-03 17:00:12] ZERO NODE ANCESTORS. Is this actually a completely flat resource?
[INFO] [2022-08-03 17:00:12] Duplicate page_id count: 0
[STOP] [2022-08-03 17:00:12] calculate_statistics
[START] [2022-08-03 17:00:12] complete_harvest_instance
[START] [2022-08-03 17:00:12] overall_tsv_creation
[INFO] [2022-08-03 17:00:12] Exporting 3 nodes as TSV in batches of 10000...
[INFO] [2022-08-03 17:00:12] Processing group of 3 in 1 batches of 10000
[INFO] [2022-08-03 17:00:54] 3 Traits (unfiltered) and 0 associations...
[INFO] [2022-08-03 17:00:54] Building Traits map for 3 nodes (this can take a while)...
[INFO] [2022-08-03 17:01:49] Mapped 3 traits (4 meta) for 3 nodes.
[INFO] [2022-08-03 17:01:49] Building Associations map (this can take a while)...
[INFO] [2022-08-03 17:01:49] Done. 0 assocs mapped (0 meta).
[INFO] [2022-08-03 17:01:49] Adding 3 traits...
[INFO] [2022-08-03 17:01:49] 4 metadata added.
[INFO] [2022-08-03 17:01:49] Adding 0 assocs...
[INFO] [2022-08-03 17:01:49] 0 metadata added.
[INFO] [2022-08-03 17:02:32] Processed 3/3 nodes
[INFO] [2022-08-03 17:02:32] Average Time: 114.46
[INFO] [2022-08-03 17:02:32] Total Time: 2m20s
[STOP] [2022-08-03 17:02:32] overall_tsv_creation
[INFO] [2022-08-03 17:02:32] Done. Check your files:
[INFO] [2022-08-03 17:02:32] (3 lines) /app/public/data/mfeam/publish_nodes.tsv
[INFO] [2022-08-03 17:02:32] (3 lines) /app/public/data/mfeam/publish_scientific_names.tsv
[INFO] [2022-08-03 17:02:32] (4 lines) /app/public/data/mfeam/publish_traits.tsv
[INFO] [2022-08-03 17:02:32] (5 lines) /app/public/data/mfeam/publish_metadata.tsv
[STOP] [2022-08-03 17:02:32] complete_harvest_instance
[START] [2022-08-03 17:02:32] completed
[STOP] [2022-08-03 17:02:32] completed
[STOP] [2022-08-03 17:02:32] logged process, took 149.0
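
Note on the calculate_delta step: the three [CMD] lines per format in the log above write "0a", then the whole converted CSV, then "." into the diff file. That appears to be an ed-style append script covering every row, which is what one would expect on a first harvest with no earlier version to compare against. The minimal Python sketch below reproduces only the line arithmetic. The function name and the placeholder rows are hypothetical and are not taken from the harvester code.

```python
def build_initial_ed_diff(csv_lines):
    """Wrap all rows in an ed-style append command.

    Mirrors the logged shell steps: echo "0a", then the CSV rows,
    then echo ".". Hypothetical helper, for illustration only.
    """
    return ["0a", *csv_lines, "."]


# Placeholder rows; the real CSV contents are not shown in the log.
nodes_rows = ["row-1", "row-2", "row-3"]
measurement_rows = [f"row-{i}" for i in range(1, 8)]

print(len(build_initial_ed_diff(nodes_rows)))        # 3 rows + 2 wrapper lines
print(len(build_initial_ed_diff(measurement_rows)))  # 7 rows + 2 wrapper lines
```

Under this reading, a 3-line CSV yields a 5-line diff and a 7-line CSV yields a 9-line diff, which agrees with the "Created diff" counts reported in the log.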
