Stage:
completed
Fetched:
07 Jun 15:42
Validated:
07 Jun 15:42
Deltas Created:
07 Jun 15:42
Units Normalized:
07 Jun 15:43
Ancestry Built:
07 Jun 15:43
Nodes Matched:
07 Jun 15:43
Names Parsed:
07 Jun 15:43
New Models Stored:
07 Jun 15:43
Indexed:
07 Jun 15:43
Completed:
07 Jun 15:45
Time to Harvest:
about 3 minutes
Harvesting Log
(180 lines)
[INFO] [2021-06-07 15:42:44] Created harvest instance #3997
[STOP] [2021-06-07 15:42:44] create_harvest_instance
[START] [2021-06-07 15:42:44] fetch_files
[STOP] [2021-06-07 15:42:44] fetch_files
[START] [2021-06-07 15:42:44] validate_each_file
[INFO] [2021-06-07 15:42:44] Looping over 3 formats...
[INFO] [2021-06-07 15:42:44] ...nodes (/app/public/data/rwctdh/taxon.tab)
[INFO] [2021-06-07 15:42:44] Valid: /app/public/converted_csv/rwctdh_nodes_3997.csv (968 lines)
[INFO] [2021-06-07 15:42:44] ...occurrences (/app/public/data/rwctdh/occurrence.tab)
[INFO] [2021-06-07 15:42:44] Valid: /app/public/converted_csv/rwctdh_occurrences_3997.csv (2838 lines)
[INFO] [2021-06-07 15:42:44] ...measurements (/app/public/data/rwctdh/measurement_or_fact_specific.tab)
[INFO] [2021-06-07 15:42:45] Valid: /app/public/converted_csv/rwctdh_measurements_3997.csv (21485 lines)
[STOP] [2021-06-07 15:42:45] validate_each_file
[START] [2021-06-07 15:42:45] convert_to_csv
[INFO] [2021-06-07 15:42:45] Looping over 3 formats...
[INFO] [2021-06-07 15:42:45] ...nodes (/app/public/data/rwctdh/taxon.tab)
[CMD] [2021-06-07 15:42:45] /usr/bin/sort /app/public/converted_csv/rwctdh_nodes_3997.csv > /app/public/converted_csv/rwctdh_nodes_3997.csv_sorted
[INFO] [2021-06-07 15:42:46] Converted: /app/public/converted_csv/rwctdh_nodes_3997.csv (968 lines)
[INFO] [2021-06-07 15:42:46] ...occurrences (/app/public/data/rwctdh/occurrence.tab)
[CMD] [2021-06-07 15:42:46] /usr/bin/sort /app/public/converted_csv/rwctdh_occurrences_3997.csv > /app/public/converted_csv/rwctdh_occurrences_3997.csv_sorted
[INFO] [2021-06-07 15:42:46] Converted: /app/public/converted_csv/rwctdh_occurrences_3997.csv (2838 lines)
[INFO] [2021-06-07 15:42:46] ...measurements (/app/public/data/rwctdh/measurement_or_fact_specific.tab)
[CMD] [2021-06-07 15:42:46] /usr/bin/sort /app/public/converted_csv/rwctdh_measurements_3997.csv > /app/public/converted_csv/rwctdh_measurements_3997.csv_sorted
[INFO] [2021-06-07 15:42:46] Converted: /app/public/converted_csv/rwctdh_measurements_3997.csv (21485 lines)
[STOP] [2021-06-07 15:42:46] convert_to_csv
[START] [2021-06-07 15:42:46] calculate_delta
[INFO] [2021-06-07 15:42:46] Looping over 3 formats...
[INFO] [2021-06-07 15:42:46] ...nodes (/app/public/data/rwctdh/taxon.tab)
[CMD] [2021-06-07 15:42:46] echo "0a" > /app/public/diff/rwctdh_nodes_3997.diff
[CMD] [2021-06-07 15:42:46] tail -n +1 /app/public/converted_csv/rwctdh_nodes_3997.csv >> /app/public/diff/rwctdh_nodes_3997.diff
[CMD] [2021-06-07 15:42:47] echo "." >> /app/public/diff/rwctdh_nodes_3997.diff
[INFO] [2021-06-07 15:42:47] Created diff: /app/public/diff/rwctdh_nodes_3997.diff (970 lines)
[INFO] [2021-06-07 15:42:47] ...occurrences (/app/public/data/rwctdh/occurrence.tab)
[CMD] [2021-06-07 15:42:47] echo "0a" > /app/public/diff/rwctdh_occurrences_3997.diff
[CMD] [2021-06-07 15:42:47] tail -n +1 /app/public/converted_csv/rwctdh_occurrences_3997.csv >> /app/public/diff/rwctdh_occurrences_3997.diff
[CMD] [2021-06-07 15:42:47] echo "." >> /app/public/diff/rwctdh_occurrences_3997.diff
[INFO] [2021-06-07 15:42:48] Created diff: /app/public/diff/rwctdh_occurrences_3997.diff (2840 lines)
[INFO] [2021-06-07 15:42:48] ...measurements (/app/public/data/rwctdh/measurement_or_fact_specific.tab)
[CMD] [2021-06-07 15:42:48] echo "0a" > /app/public/diff/rwctdh_measurements_3997.diff
[CMD] [2021-06-07 15:42:48] tail -n +1 /app/public/converted_csv/rwctdh_measurements_3997.csv >> /app/public/diff/rwctdh_measurements_3997.diff
[CMD] [2021-06-07 15:42:48] echo "." >> /app/public/diff/rwctdh_measurements_3997.diff
[INFO] [2021-06-07 15:42:48] Created diff: /app/public/diff/rwctdh_measurements_3997.diff (21487 lines)
[STOP] [2021-06-07 15:42:48] calculate_delta
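The three commands logged for each format in calculate_delta appear to assemble an ed-style script by hand: `0a` (append after line 0), then the whole converted CSV, then a lone `.` to end the input. Every row is therefore treated as new, presumably because there is no earlier harvest to compare against, and each diff comes out exactly two lines longer than its CSV (968 to 970, 2838 to 2840, 21485 to 21487). A minimal Python sketch of the same construction follows; `build_initial_diff` and the sample rows are hypothetical and not part of the harvester.

```python
def build_initial_diff(csv_lines):
    """Wrap every CSV row in an ed-style "append after line 0" command."""
    # "0a" opens the append block; a lone "." on its own line closes it.
    return ["0a", *csv_lines, "."]


# Made-up sample rows, for illustration only.
rows = ["id1,Lecane nana", "id2,Lecane papuana"]
diff = build_initial_diff(rows)

# The diff always has two more lines than the CSV it wraps.
print(len(rows), len(diff))
```

Whether the harvester later feeds this file to `ed` or parses it itself is not visible in the log.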
[START] [2021-06-07 15:42:48] parse_diff_and_store
[INFO] [2021-06-07 15:42:48] Handling diff: /app/public/diff/rwctdh_nodes_3997.diff (970 lines)
[INFO] [2021-06-07 15:42:49] Loading nodes diff file into memory (970 /app/public/diff/rwctdh_nodes_3997.diff lines)...
[WARN] [2021-06-07 15:42:49] Filtered Scientific Name `Kellicottia longispina (Kellicott, 1879)` to `Kellicottia longispina (Kellicott, 1879)`
[WARN] [2021-06-07 15:42:49] Filtered Scientific Name `Euchlanis calpidia (Myers, 1930)` to `Euchlanis calpidia (Myers, 1930)`
[WARN] [2021-06-07 15:42:49] Filtered Scientific Name `Anuraeopsis fissa (Gosse, 1851)` to `Anuraeopsis fissa (Gosse, 1851)`
[WARN] [2021-06-07 15:42:49] Filtered Scientific Name `Trichocerca collaris (Rousselet, 1896)` to `Trichocerca collaris (Rousselet, 1896)`
[WARN] [2021-06-07 15:42:49] Filtered Scientific Name `Lecane subulata (Harring et Myers, 1926)` to `Lecane subulata (Harring et Myers, 1926)`
[WARN] [2021-06-07 15:42:49] Filtered Scientific Name `Lecane thienemanni (Hauer, 1938)` to `Lecane thienemanni (Hauer, 1938)`
[WARN] [2021-06-07 15:42:49] Filtered Scientific Name `Pseudoploesoma formosum (Myers, 1934)` to `Pseudoploesoma formosum (Myers, 1934)`
[WARN] [2021-06-07 15:42:49] Filtered Scientific Name `Notholca squamula (Müller, 1786)` to `Notholca squamula (Müller, 1786)`
[WARN] [2021-06-07 15:42:49] Filtered Scientific Name `Lecane vastita (Harring et Myers, 1926)` to `Lecane vastita (Harring et Myers, 1926)`
[WARN] [2021-06-07 15:42:49] Filtered Scientific Name `Trichocerca pusilla (Jennings, 1903)` to `Trichocerca pusilla (Jennings, 1903)`
[WARN] [2021-06-07 15:42:49] Filtered Scientific Name `Resticula gelida (Harring et Myers, 1922)` to `Resticula gelida (Harring et Myers, 1922)`
[WARN] [2021-06-07 15:42:49] Filtered Scientific Name `Cephalodella auriculata (Müller, 1773)` to `Cephalodella auriculata (Müller, 1773)`
[WARN] [2021-06-07 15:42:49] Filtered Scientific Name `Keratella testudo (Ehrenberg, 1832)` to `Keratella testudo (Ehrenberg, 1832)`
[WARN] [2021-06-07 15:42:49] Filtered Scientific Name `Lindia gravitata (Lie-Pettersen, 1905)` to `Lindia gravitata (Lie-Pettersen, 1905)`
[WARN] [2021-06-07 15:42:49] Filtered Scientific Name `Proales minima (Montet, 1915)` to `Proales minima (Montet, 1915)`
[WARN] [2021-06-07 15:42:49] Filtered Scientific Name `Colurella colurus (Ehrenberg, 1830)` to `Colurella colurus (Ehrenberg, 1830)`
[WARN] [2021-06-07 15:42:49] Filtered Scientific Name `Cephalodella sterea (Gosse, 1887)` to `Cephalodella sterea (Gosse, 1887)`
[WARN] [2021-06-07 15:42:49] Filtered Scientific Name `Lecane papuana (Murray, 1913)` to `Lecane papuana (Murray, 1913)`
[WARN] [2021-06-07 15:42:49] Filtered Scientific Name `Brachionus diversicornis (Daday, 1883)` to `Brachionus diversicornis (Daday, 1883)`
[WARN] [2021-06-07 15:42:49] Filtered Scientific Name `Mytilina bisulcata (Lucks, 1912)` to `Mytilina bisulcata (Lucks, 1912)`
[WARN] [2021-06-07 15:42:49] Filtered Scientific Name `Lecane quadridentata (Ehrenberg, 1830)` to `Lecane quadridentata (Ehrenberg, 1830)`
[WARN] [2021-06-07 15:42:49] Filtered Scientific Name `Lepadella patella (Müller, 1773)` to `Lepadella patella (Müller, 1773)`
[WARN] [2021-06-07 15:42:49] Filtered Scientific Name `Lecane nana (Murray, 1913)` to `Lecane nana (Murray, 1913)`
[WARN] [2021-06-07 15:42:49] Filtered Scientific Name `Lepadella quadricarinata (Stenroos, 1898)` to `Lepadella quadricarinata (Stenroos, 1898)`
[WARN] [2021-06-07 15:42:49] Filtered Scientific Name `Collotheca pelagica (Rousselet, 1893)` to `Collotheca pelagica (Rousselet, 1893)`
[WARN] [2021-06-07 15:42:49] Filtered Scientific Name `Trichocerca vernalis (Hauer, 1936)` to `Trichocerca vernalis (Hauer, 1936)`
[WARN] [2021-06-07 15:42:49] Filtered Scientific Name `Trichocerca cavia (Gosse, 1886)` to `Trichocerca cavia (Gosse, 1886)`
[WARN] [2021-06-07 15:42:49] Filtered Scientific Name `Ascomorpha ovalis (Bergendal, 1892)` to `Ascomorpha ovalis (Bergendal, 1892)`
[WARN] [2021-06-07 15:42:49] Filtered Scientific Name `Squatinella rostrum (Schmarda, 1846)` to `Squatinella rostrum (Schmarda, 1846)`
[WARN] [2021-06-07 15:42:49] (Reached filtered-name limit; suppressing further warnings.)
[INFO] [2021-06-07 15:42:49] Handling diff: /app/public/diff/rwctdh_occurrences_3997.diff (2840 lines)
[INFO] [2021-06-07 15:42:49] Loading occurrences diff file into memory (2840 /app/public/diff/rwctdh_occurrences_3997.diff lines)...
[INFO] [2021-06-07 15:42:50] Handling diff: /app/public/diff/rwctdh_measurements_3997.diff (21487 lines)
[INFO] [2021-06-07 15:42:50] Loading measurements diff file into memory (21487 /app/public/diff/rwctdh_measurements_3997.diff lines)...
[INFO] [2021-06-07 15:42:59] Storing 968 ScientificNames
[INFO] [2021-06-07 15:42:59] Processing group of 968 in 1 groups of 1000
[INFO] [2021-06-07 15:42:59] Average Time: 0.33
[INFO] [2021-06-07 15:42:59] Total Time: 1s
[INFO] [2021-06-07 15:42:59] Storing 968 Nodes
[INFO] [2021-06-07 15:42:59] Processing group of 968 in 1 groups of 1000
[INFO] [2021-06-07 15:42:59] Average Time: 0.26
[INFO] [2021-06-07 15:42:59] Total Time: 1s
[INFO] [2021-06-07 15:42:59] Storing 2838 Occurrences
[INFO] [2021-06-07 15:42:59] Processing group of 2838 in 3 groups of 1000
[INFO] [2021-06-07 15:43:00] Average Time: 0.107
[INFO] [2021-06-07 15:43:00] Total Time: 1s
[INFO] [2021-06-07 15:43:00] Storing 21485 Traits
[INFO] [2021-06-07 15:43:00] Processing group of 21485 in 22 groups of 1000
[INFO] [2021-06-07 15:43:06] Average Time: 0.303
[INFO] [2021-06-07 15:43:06] Total Time: 7s
[INFO] [2021-06-07 15:43:06] last 3 / first 3: 0.79
[INFO] [2021-06-07 15:43:06] Std.Dev: 0.06324555320336758; Max: 0.44
[INFO] [2021-06-07 15:43:06] Storing 1674 MetaTraits
[INFO] [2021-06-07 15:43:06] Processing group of 1674 in 2 groups of 1000
[INFO] [2021-06-07 15:43:07] Average Time: 0.095
[INFO] [2021-06-07 15:43:07] Total Time: 1s
[STOP] [2021-06-07 15:43:07] parse_diff_and_store
[START] [2021-06-07 15:43:07] resolve_keys
[INFO] [2021-06-07 15:43:13] Occurrences to nodes (through scientific_names)...
[INFO] [2021-06-07 15:43:13] traits to occurrences...
[INFO] [2021-06-07 15:43:13] traits to nodes (through occurrences)...
[INFO] [2021-06-07 15:43:13] Traits to sex term...
[INFO] [2021-06-07 15:43:13] Traits to lifestage term...
[INFO] [2021-06-07 15:43:13] MetaTraits to traits...
[INFO] [2021-06-07 15:43:13] MetaTraits (simple, measurement row refers to parent) to traits...
[INFO] [2021-06-07 15:43:14] Assocs to occurrences...
[INFO] [2021-06-07 15:43:14] Assocs to nodes...
[INFO] [2021-06-07 15:43:14] Assoc to sex term...
[INFO] [2021-06-07 15:43:14] Assoc to lifestage term...
[INFO] [2021-06-07 15:43:14] MetaAssoc to assocs...
[STOP] [2021-06-07 15:43:14] resolve_keys
[START] [2021-06-07 15:43:14] hold_for_later_1
[STOP] [2021-06-07 15:43:14] hold_for_later_1
[START] [2021-06-07 15:43:14] hold_for_later_2
[STOP] [2021-06-07 15:43:14] hold_for_later_2
[START] [2021-06-07 15:43:14] resolve_missing_parents
[STOP] [2021-06-07 15:43:14] resolve_missing_parents
[START] [2021-06-07 15:43:14] rebuild_nodes
[START] [2021-06-07 15:43:14] Flattener#flatten
[START] [2021-06-07 15:43:14] Flattener#study_resource
[START] [2021-06-07 15:43:14] Flattener#build_ancestry
[STOP] [2021-06-07 15:43:14] Flattener#build_ancestry
[INFO] [2021-06-07 15:43:14] 968 ancestry keys
[START] [2021-06-07 15:43:14] build_node_ancestors
[INFO] [2021-06-07 15:43:14] old ancestors deleted.
[STOP] [2021-06-07 15:43:14] build_node_ancestors
[WARN] [2021-06-07 15:43:14] Flattener: nothing to flatten! (Completely flat resource?)
[STOP] [2021-06-07 15:43:14] Flattener#flatten
[STOP] [2021-06-07 15:43:14] rebuild_nodes
[START] [2021-06-07 15:43:14] resolve_missing_media_owners
[STOP] [2021-06-07 15:43:14] resolve_missing_media_owners
[START] [2021-06-07 15:43:14] sanitize_media_verbatims
[STOP] [2021-06-07 15:43:14] sanitize_media_verbatims
[START] [2021-06-07 15:43:14] queue_downloads
[STOP] [2021-06-07 15:43:14] queue_downloads
[START] [2021-06-07 15:43:14] parse_names
[WARN] [2021-06-07 15:43:14] I see 968 names which still need to be parsed.
[STOP] [2021-06-07 15:43:16] parse_names
[START] [2021-06-07 15:43:16] denormalize_canonical_names_to_nodes
[STOP] [2021-06-07 15:43:16] denormalize_canonical_names_to_nodes
[START] [2021-06-07 15:43:16] match_nodes
[START] [2021-06-07 15:43:16] map_all_nodes_to_pages
[STOP] [2021-06-07 15:43:28] map_all_nodes_to_pages
[INFO] [2021-06-07 15:43:28] 84 Unmatched nodes (of 968)! That's too many to output. Full list in /app/public/data/rwctdh/unmatched_nodes.txt ; First 10: Canonical: Trichotria brevidactyla; Node#95786577; ResourceID: 00573e1ba7fc27d57b9d2b11dc0c0f0d; Canonical: Brachionus havanaensis minnesotensis; Node#95786589; ResourceID: 02a1868d9d5bef6e15333bc0d590243f; Canonical: Lecane vastita; Node#95786618; ResourceID: 0a935dcaaf0d55271506e6e66481a16e; Canonical: Collotheca orchidacea; Node#95786619; ResourceID: 0ad846191e0e66861546c0c1d1407a08; Canonical: Brachionus edentatus; Node#95786638; ResourceID: 0e64a727f7c88daae9a2801d107d5534; Canonical: Lepadella lata ovata; Node#95786662; ResourceID: 130aabe3329fa62cdd922a1291f28081; Canonical: Lecane hamata fernandoi; Node#95786670; ResourceID: 15d4c48fa10f63876c3e49163ec405fa; Canonical: Brachionus caudatus aculeatus; Node#95786673; ResourceID: 166566453000732efe35c65428884ef2; Canonical: Brachionus dimidiatus isigakiensis; Node#95786685; ResourceID: 185967313c055f1a7b638e849e4936df; Canonical: Lecane bulla dentata; Node#95786699; ResourceID: 1b2a43e4ed4bae8158053c42492dc843
[START] [2021-06-07 15:43:28] update_nodes
[STOP] [2021-06-07 15:43:28] update_nodes
[STOP] [2021-06-07 15:43:28] match_nodes
[START] [2021-06-07 15:43:28] reindex_search
[STOP] [2021-06-07 15:43:28] reindex_search
[START] [2021-06-07 15:43:28] normalize_units
[STOP] [2021-06-07 15:43:31] normalize_units
[START] [2021-06-07 15:43:31] calculate_statistics
[2021-06-07 15:43:31] ZERO NODE ANCESTORS. Is this actually a completely flat resource?
[STOP] [2021-06-07 15:43:31] calculate_statistics
[START] [2021-06-07 15:43:31] complete_harvest_instance
[START] [2021-06-07 15:43:31] overall_tsv_creation
[INFO] [2021-06-07 15:43:31] Processing group of 968 in 1 batches of 10000
[INFO] [2021-06-07 15:44:10] 2837 Traits (unfiltered)...
[INFO] [2021-06-07 15:44:58] 2837 Traits (filtered)...
[INFO] [2021-06-07 15:44:58] 0 Associations (filtered)...
[INFO] [2021-06-07 15:44:59] 17454 metadata added.
[INFO] [2021-06-07 15:44:59] 0 metadata added.
[INFO] [2021-06-07 15:45:27] Average Time: 90.73
[INFO] [2021-06-07 15:45:27] Total Time: 1m57s
[STOP] [2021-06-07 15:45:27] overall_tsv_creation
[INFO] [2021-06-07 15:45:27] Done. Check your files:
[INFO] [2021-06-07 15:45:27] (968 lines) /app/public/data/rwctdh/publish_nodes.tsv
[INFO] [2021-06-07 15:45:28] (968 lines) /app/public/data/rwctdh/publish_scientific_names.tsv
[INFO] [2021-06-07 15:45:28] (2838 lines) /app/public/data/rwctdh/publish_traits.tsv
[INFO] [2021-06-07 15:45:28] (17455 lines) /app/public/data/rwctdh/publish_metadata.tsv
[STOP] [2021-06-07 15:45:28] complete_harvest_instance
[START] [2021-06-07 15:45:28] completed
[STOP] [2021-06-07 15:45:28] completed
[STOP] [2021-06-07 15:45:28] logged process, took 164.73
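The [START]/[STOP] pairs in the log above can be used to see where the logged 164.73 seconds went. The following is a minimal sketch, assuming the line layout seen in this log (`[LEVEL] [YYYY-MM-DD HH:MM:SS] message`); `stage_durations` is a hypothetical helper, not part of the harvester, and it only has whole-second resolution because that is all the timestamps carry.

```python
import re
from datetime import datetime

# Matches only START/STOP lines; the stage name is everything after the timestamp.
LINE = re.compile(
    r"^\[(START|STOP)\] \[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\] (.+)$"
)


def stage_durations(lines):
    """Return {stage: seconds} for every stage that has a START and a later STOP."""
    started, durations = {}, {}
    for line in lines:
        m = LINE.match(line.strip())
        if not m:
            continue  # INFO, WARN, CMD and untagged lines are ignored
        kind, stamp, name = m.groups()
        when = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S")
        if kind == "START":
            started[name] = when
        elif name in started:  # a STOP without a matching START is skipped
            durations[name] = (when - started.pop(name)).total_seconds()
    return durations


# A few lines copied from the log above.
sample = [
    "[START] [2021-06-07 15:43:16] map_all_nodes_to_pages",
    "[STOP] [2021-06-07 15:43:28] map_all_nodes_to_pages",
    "[START] [2021-06-07 15:43:31] overall_tsv_creation",
    "[INFO] [2021-06-07 15:44:10] 2837 Traits (unfiltered)...",
    "[STOP] [2021-06-07 15:45:27] overall_tsv_creation",
]
print(stage_durations(sample))
```

By these timestamps, overall_tsv_creation accounts for 116 of the roughly 165 seconds, and map_all_nodes_to_pages for another 12.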
Latest Process