Stage:
completed
Fetched:
03 Apr 14:26
Validated:
03 Apr 14:26
Deltas Created:
03 Apr 14:26
Units Normalized:
03 Apr 14:33
Ancestry Built:
03 Apr 14:29
Nodes Matched:
03 Apr 14:33
Names Parsed:
03 Apr 14:29
New Models Stored:
03 Apr 14:28
Indexed:
03 Apr 14:33
Completed:
03 Apr 14:38
Time to Harvest:
less than a minute
Harvesting Log
(257 lines)
[INFO] [2023-04-03 14:26:10] Created harvest instance #4327
[STOP] [2023-04-03 14:26:10] create_harvest_instance
[START] [2023-04-03 14:26:10] fetch_files
[STOP] [2023-04-03 14:26:10] fetch_files
[START] [2023-04-03 14:26:10] validate_each_file
[INFO] [2023-04-03 14:26:10] Looping over 5 formats...
[INFO] [2023-04-03 14:26:10] ...agents (/app/public/data/adw-birds/agent.tab)
[INFO] [2023-04-03 14:26:10] Valid: /app/public/data/adw-birds/converted_csv/adw-birds_agents_30269.csv (78 lines)
[INFO] [2023-04-03 14:26:10] ...nodes (/app/public/data/adw-birds/taxon.tab)
[INFO] [2023-04-03 14:26:10] Valid: /app/public/data/adw-birds/converted_csv/adw-birds_nodes_30268.csv (11044 lines)
[INFO] [2023-04-03 14:26:10] ...media (/app/public/data/adw-birds/media_resource.tab)
[INFO] [2023-04-03 14:26:12] Valid: /app/public/data/adw-birds/converted_csv/adw-birds_media_30270.csv (11333 lines)
[INFO] [2023-04-03 14:26:12] ...occurrences (/app/public/data/adw-birds/occurrence_specific.tab)
[INFO] [2023-04-03 14:26:13] Valid: /app/public/data/adw-birds/converted_csv/adw-birds_occurrences_30271.csv (74189 lines)
[INFO] [2023-04-03 14:26:13] ...measurements (/app/public/data/adw-birds/measurement_or_fact_specific.tab)
[INFO] [2023-04-03 14:26:16] Valid: /app/public/data/adw-birds/converted_csv/adw-birds_measurements_30272.csv (74713 lines)
[STOP] [2023-04-03 14:26:16] validate_each_file
[START] [2023-04-03 14:26:16] convert_to_csv
[INFO] [2023-04-03 14:26:16] Looping over 5 formats...
[INFO] [2023-04-03 14:26:16] ...agents (/app/public/data/adw-birds/agent.tab)
[CMD] [2023-04-03 14:26:16] /usr/bin/sort /app/public/data/adw-birds/converted_csv/adw-birds_agents_30269.csv > /app/public/data/adw-birds/converted_csv/adw-birds_agents_30269.csv_sorted
[INFO] [2023-04-03 14:26:16] Converted: /app/public/data/adw-birds/converted_csv/adw-birds_agents_30269.csv (78 lines)
[INFO] [2023-04-03 14:26:16] ...nodes (/app/public/data/adw-birds/taxon.tab)
[CMD] [2023-04-03 14:26:16] /usr/bin/sort /app/public/data/adw-birds/converted_csv/adw-birds_nodes_30268.csv > /app/public/data/adw-birds/converted_csv/adw-birds_nodes_30268.csv_sorted
[INFO] [2023-04-03 14:26:17] Converted: /app/public/data/adw-birds/converted_csv/adw-birds_nodes_30268.csv (11044 lines)
[INFO] [2023-04-03 14:26:17] ...media (/app/public/data/adw-birds/media_resource.tab)
[CMD] [2023-04-03 14:26:17] /usr/bin/sort /app/public/data/adw-birds/converted_csv/adw-birds_media_30270.csv > /app/public/data/adw-birds/converted_csv/adw-birds_media_30270.csv_sorted
[INFO] [2023-04-03 14:26:17] Converted: /app/public/data/adw-birds/converted_csv/adw-birds_media_30270.csv (11333 lines)
[INFO] [2023-04-03 14:26:17] ...occurrences (/app/public/data/adw-birds/occurrence_specific.tab)
[CMD] [2023-04-03 14:26:17] /usr/bin/sort /app/public/data/adw-birds/converted_csv/adw-birds_occurrences_30271.csv > /app/public/data/adw-birds/converted_csv/adw-birds_occurrences_30271.csv_sorted
[INFO] [2023-04-03 14:26:17] Converted: /app/public/data/adw-birds/converted_csv/adw-birds_occurrences_30271.csv (74189 lines)
[INFO] [2023-04-03 14:26:17] ...measurements (/app/public/data/adw-birds/measurement_or_fact_specific.tab)
[CMD] [2023-04-03 14:26:17] /usr/bin/sort /app/public/data/adw-birds/converted_csv/adw-birds_measurements_30272.csv > /app/public/data/adw-birds/converted_csv/adw-birds_measurements_30272.csv_sorted
[INFO] [2023-04-03 14:26:18] Converted: /app/public/data/adw-birds/converted_csv/adw-birds_measurements_30272.csv (74713 lines)
[STOP] [2023-04-03 14:26:18] convert_to_csv
[START] [2023-04-03 14:26:21] calculate_delta
[INFO] [2023-04-03 14:26:21] Looping over 5 formats...
[INFO] [2023-04-03 14:26:21] ...agents (/app/public/data/adw-birds/agent.tab)
[CMD] [2023-04-03 14:26:21] echo "0a" > /app/public/data/adw-birds/diff/adw-birds_agents_30269.diff
[CMD] [2023-04-03 14:26:21] tail -n +1 /app/public/data/adw-birds/converted_csv/adw-birds_agents_30269.csv >> /app/public/data/adw-birds/diff/adw-birds_agents_30269.diff
[CMD] [2023-04-03 14:26:21] echo "." >> /app/public/data/adw-birds/diff/adw-birds_agents_30269.diff
[INFO] [2023-04-03 14:26:22] Created diff: /app/public/data/adw-birds/diff/adw-birds_agents_30269.diff (80 lines)
[INFO] [2023-04-03 14:26:22] ...nodes (/app/public/data/adw-birds/taxon.tab)
[CMD] [2023-04-03 14:26:22] echo "0a" > /app/public/data/adw-birds/diff/adw-birds_nodes_30268.diff
[CMD] [2023-04-03 14:26:22] tail -n +1 /app/public/data/adw-birds/converted_csv/adw-birds_nodes_30268.csv >> /app/public/data/adw-birds/diff/adw-birds_nodes_30268.diff
[CMD] [2023-04-03 14:26:22] echo "." >> /app/public/data/adw-birds/diff/adw-birds_nodes_30268.diff
[INFO] [2023-04-03 14:26:22] Created diff: /app/public/data/adw-birds/diff/adw-birds_nodes_30268.diff (11046 lines)
[INFO] [2023-04-03 14:26:22] ...media (/app/public/data/adw-birds/media_resource.tab)
[CMD] [2023-04-03 14:26:22] echo "0a" > /app/public/data/adw-birds/diff/adw-birds_media_30270.diff
[CMD] [2023-04-03 14:26:22] tail -n +1 /app/public/data/adw-birds/converted_csv/adw-birds_media_30270.csv >> /app/public/data/adw-birds/diff/adw-birds_media_30270.diff
[CMD] [2023-04-03 14:26:23] echo "." >> /app/public/data/adw-birds/diff/adw-birds_media_30270.diff
[INFO] [2023-04-03 14:26:23] Created diff: /app/public/data/adw-birds/diff/adw-birds_media_30270.diff (11335 lines)
[INFO] [2023-04-03 14:26:23] ...occurrences (/app/public/data/adw-birds/occurrence_specific.tab)
[CMD] [2023-04-03 14:26:23] echo "0a" > /app/public/data/adw-birds/diff/adw-birds_occurrences_30271.diff
[CMD] [2023-04-03 14:26:23] tail -n +1 /app/public/data/adw-birds/converted_csv/adw-birds_occurrences_30271.csv >> /app/public/data/adw-birds/diff/adw-birds_occurrences_30271.diff
[CMD] [2023-04-03 14:26:23] echo "." >> /app/public/data/adw-birds/diff/adw-birds_occurrences_30271.diff
[INFO] [2023-04-03 14:26:24] Created diff: /app/public/data/adw-birds/diff/adw-birds_occurrences_30271.diff (74191 lines)
[INFO] [2023-04-03 14:26:24] ...measurements (/app/public/data/adw-birds/measurement_or_fact_specific.tab)
[CMD] [2023-04-03 14:26:24] echo "0a" > /app/public/data/adw-birds/diff/adw-birds_measurements_30272.diff
[CMD] [2023-04-03 14:26:24] tail -n +1 /app/public/data/adw-birds/converted_csv/adw-birds_measurements_30272.csv >> /app/public/data/adw-birds/diff/adw-birds_measurements_30272.diff
[CMD] [2023-04-03 14:26:24] echo "." >> /app/public/data/adw-birds/diff/adw-birds_measurements_30272.diff
[INFO] [2023-04-03 14:26:25] Created diff: /app/public/data/adw-birds/diff/adw-birds_measurements_30272.diff (74715 lines)
[STOP] [2023-04-03 14:26:25] calculate_delta
[START] [2023-04-03 14:26:25] parse_diff_and_store
[INFO] [2023-04-03 14:26:25] Handling diff: /app/public/data/adw-birds/diff/adw-birds_agents_30269.diff (80 lines)
[INFO] [2023-04-03 14:26:25] Loading agents diff file into memory (80 lines)...
[INFO] [2023-04-03 14:26:25] Storing 78 Attributions (78/78/80)
[INFO] [2023-04-03 14:26:25] Handling diff: /app/public/data/adw-birds/diff/adw-birds_nodes_30268.diff (11046 lines)
[INFO] [2023-04-03 14:26:25] Loading nodes diff file into memory (11046 lines)...
[INFO] [2023-04-03 14:26:29] Storing 11877 ScientificNames (23754/10000/11046)
[INFO] [2023-04-03 14:26:33] Storing 11877 Nodes (23754/10000/11046)
[WARN] [2023-04-03 14:26:40] SKIPPED 383 Scientific names (26788/11044/11046) with resource_pks already be in the database!
[WARN] [2023-04-03 14:26:40] SKIPPED 383 Nodes (26788/11044/11046) with resource_pks already be in the database!
[INFO] [2023-04-03 14:26:40] Storing 1134 ScientificNames (26788/11044/11046)
[INFO] [2023-04-03 14:26:40] Storing 1134 Nodes (26788/11044/11046)
[INFO] [2023-04-03 14:26:41] Handling diff: /app/public/data/adw-birds/diff/adw-birds_media_30270.diff (11335 lines)
[INFO] [2023-04-03 14:26:41] Loading media diff file into memory (11335 lines)...
[WARN] [2023-04-03 14:26:42] title is too long for medium 0cde34df08d656359a64ce59c661bc3a; truncating to 254 chars: Found in mixed clay and leaf mold at the edge of a shaded dirt road being widened by hand during the construction phase of a church youth camp. These species were found together at the same site, both in the same shovel-full of dirt. Photographed in a white plastic five-gallon pail with a point-and-shoot digital camera on macro setting....
[WARN] [2023-04-03 14:26:43] title is too long for medium 1bb586c5dea5b4b6de798bf0d8443fcf; truncating to 254 chars: Acacia gazelles (Gazella gazella acaciae), an exceptionally rare subspecies of mountain gazelles (Gazella gazella). This critically endangered subspecies seems to have been isolated since the last ice age, they are quite distinct genetically. There are only about 20 individuals left in the wild....
[WARN] [2023-04-03 14:26:43] title is too long for medium 2525411684e09e43e98bae96d4136895; truncating to 254 chars: Great cormorant (Phalacrocorax carbo). This bird is being used by fishing people to capture fish. Notice the constriction near the base of the neck that prevents the bird from swallowing the fish. The fish is then removed when the birds' mouth is opened, stimulating the regurgitation reflex....
[WARN] [2023-04-03 14:26:43] title is too long for medium 254f18aab867c53ec3bcd3867a2241be; truncating to 254 chars: Found in mixed clay and leaf mold at the edge of a shaded dirt road being widened by hand during the construction phase of a church youth camp. These species were found together at the same site, both in the same shovel-full of dirt. Photographed in a white plastic five-gallon pail with a point-and-shoot digital camera on macro setting....
[WARN] [2023-04-03 14:26:44] title is too long for medium 2ad3574c4b35d827deb317b664ec55ba; truncating to 254 chars: Found in mixed clay and leaf mold at the edge of a shaded dirt road being widened by hand during the construction phase of a church youth camp. These species were found together at the same site, both in the same shovel-full of dirt. Photographed in a white plastic five-gallon pail with a point-and-shoot digital camera on macro setting....
[WARN] [2023-04-03 14:26:44] title is too long for medium 36340c2e86d1770f3d7602e84ce3f610; truncating to 254 chars: Found in mixed clay and leaf mold at the edge of a shaded dirt road being widened by hand during the construction phase of a church youth camp. These species were found together at the same site, both in the same shovel-full of dirt. Photographed in a white plastic five-gallon pail with a point-and-shoot digital camera on macro setting....
[WARN] [2023-04-03 14:26:45] title is too long for medium 4068d66eeef68119cccb284ecd74e14f; truncating to 254 chars: Acacia gazelles (Gazella gazella acaciae), an exceptionally rare subspecies of mountain gazelles (Gazella gazella). This critically endangered subspecies seems to have been isolated since the last ice age, they are quite distinct genetically. There are only about 20 individuals left in the wild....
[WARN] [2023-04-03 14:26:45] title is too long for medium 464e02244277f1bf0a3ecea7619de208; truncating to 254 chars: King penguin (*Aptenodytes patagonicus*0\n \n Subject\n \n Live Animal\n \n \n \n Type\n \n Photo\n \n \n \n Life Stages And Gender\n \n Adult/Sexu...
[WARN] [2023-04-03 14:26:46] title is too long for medium 554f25de97b717762dc70907b6d1ac3f; truncating to 254 chars: Acacia gazelles (Gazella gazella acaciae), an exceptionally rare subspecies of mountain gazelles (Gazella gazella). This critically endangered subspecies seems to have been isolated since the last ice age, they are quite distinct genetically. There are only about 20 individuals left in the wild....
[WARN] [2023-04-03 14:26:46] title is too long for medium 58ee8d9daf5bad2fa652e98ffaf5a311; truncating to 254 chars: Black And White Colobus at the Toledo Zoo. The adult monkey on the right is holding a 6 day old baby colubus on its chest - you can see the head just over the top of the branch in the foreground of the picture. The young have pure white fur for the first weeks of their life. (Toledo, Ohio, March 1999)...
[WARN] [2023-04-03 14:26:46] title is too long for medium 5c043b357250a28dcd6c392bdcd165b5; truncating to 254 chars: Found in mixed clay and leaf mold at the edge of a shaded dirt road being widened by hand during the construction phase of a church youth camp. These species were found together at the same site, both in the same shovel-full of dirt. Photographed in a white plastic five-gallon pail with a point-and-shoot digital camera on macro setting....
[WARN] [2023-04-03 14:26:47] title is too long for medium 66b9d5de9ab4a3ff114a589e1300c547; truncating to 254 chars: Found in mixed clay and leaf mold at the edge of a shaded dirt road being widened by hand during the construction phase of a church youth camp. These species were found together at the same site, both in the same shovel-full of dirt. Photographed in a white plastic five-gallon pail with a point-and-shoot digital camera on macro setting....
[WARN] [2023-04-03 14:26:47] title is too long for medium 671411f39551fd83e3b33472bab8326e; truncating to 254 chars: Acacia gazelles (Gazella gazella acaciae), an exceptionally rare subspecies of mountain gazelles (Gazella gazella). This critically endangered subspecies seems to have been isolated since the last ice age, they are quite distinct genetically. There are only about 20 individuals left in the wild....
[WARN] [2023-04-03 14:26:49] title is too long for medium 91c7381f496061843285b5c0337dc138; truncating to 254 chars: Hilsa shad (Tenualosa ilisha). Local name in Pakistan: palla or pallo. The species is known for anadromous migration for spawning in River Indus. The larvae hatch in fresh water and when attain Juvenile stage migrate to sea and attain maturity runs towards the river Indus Spawning. (Ventral Fin: 7, Gill rakers 60 – 100, Scutes: 32-33, scales in lateral series: 37 - 47, Length Range: Juveniles (unsexed): 101-151 mm, male: 247-393 mm; 270-360 mm Female: 250-450 mm; 300-370 mm, Weight Range: Juveniles: 45 g...
[WARN] [2023-04-03 14:26:50] title is too long for medium a4a3527ffe83876f0af61ed8fb27923d; truncating to 254 chars: Typhlops schwartzi (blind snake, smaller, dark individual) and an Amphisbaena manni (burrowing legless lizard, larger, lighter individual). Found in mixed clay and leaf mold at the edge of a shaded dirt road being widened by hand during the construction phase of a church youth camp. These species were found together at the same site, both in the same shovel-full of dirt. Photographed in a white plastic five-gallon pail with a point-and-shoot digital camera on macro setting....
[WARN] [2023-04-03 14:26:51] title is too long for medium b48e09215a6e410c44539f7dee07ae02; truncating to 254 chars: Many bee and wasp species make nests with little rooms. Each room has just one egg and a supply of food (pollen for baby bees, paralyzed insects or spiders for baby wasps). The baby eats up its food, grows and transforms inside the nest, and emerges as an adult....
[WARN] [2023-04-03 14:26:51] title is too long for medium b9e6dc9440d888169cf5225664b600cc; truncating to 254 chars: Domestic dogs, border collies (Canis lupus familiaris). Adult dog displays the "merle" trait, a gene that differentially expresses the underlying black hair pigment gene. Juvenile dog is 6 months old and expresses classic black and white border collie markings....
[WARN] [2023-04-03 14:26:52] title is too long for medium c5312991ef57ed7a3d20be9fdefca1e0; truncating to 254 chars: Found in mixed clay and leaf mold at the edge of a shaded dirt road being widened by hand during the construction phase of a church youth camp. These species were found together at the same site, both in the same shovel-full of dirt. Photographed in a white plastic five-gallon pail with a point-and-shoot digital camera on macro setting....
[WARN] [2023-04-03 14:26:52] title is too long for medium c59175d6393485e354523f049e3671b6; truncating to 254 chars: \nwhite-crowned sparrow\n \n Subject\n \n Live Animal\n \n \n \n Type\n \n Photo\n \n \n \n Life Stages And Gender\n \n Adult/Sexually Mature\n ...
[WARN] [2023-04-03 14:26:52] title is too long for medium cc4b6d77a22f526a0b0838c55d8638c6; truncating to 254 chars: Acacia gazelles (Gazella gazella acaciae), an exceptionally rare subspecies of mountain gazelles (Gazella gazella). This critically endangered subspecies seems to have been isolated since the last ice age, they are quite distinct genetically. There are only about 20 individuals left in the wild....
[WARN] [2023-04-03 14:26:53] title is too long for medium d8c3868d32b535c6179a3c0d6ce20cf7; truncating to 254 chars: Grass spiders, also called funnel weaver spiders (family Agelenidae) make flat or cup-shaped sheet webs with a tunnel at one corner that they hide in. The webs are not sticky, but the spiders can move very quickly on them -- dashing out of the tunnel to grab prey that walks on their web....
[INFO] [2023-04-03 14:26:53] Storing 9999 ContentAttributions (19998/10000/11335)
[INFO] [2023-04-03 14:26:56] Storing 9999 Media (19998/10000/11335)
[WARN] [2023-04-03 14:27:02] title is too long for medium eb7f29aa787a09334bd1a6ca909e2985; truncating to 254 chars: Found in mixed clay and leaf mold at the edge of a shaded dirt road being widened by hand during the construction phase of a church youth camp. These species were found together at the same site, both in the same shovel-full of dirt. Photographed in a white plastic five-gallon pail with a point-and-shoot digital camera on macro setting....
[INFO] [2023-04-03 14:27:03] Storing 1334 ContentAttributions (22666/11333/11335)
[INFO] [2023-04-03 14:27:04] Storing 1334 Media (22666/11333/11335)
[INFO] [2023-04-03 14:27:05] Handling diff: /app/public/data/adw-birds/diff/adw-birds_occurrences_30271.diff (74191 lines)
[INFO] [2023-04-03 14:27:05] Loading occurrences diff file into memory (74191 lines)...
[INFO] [2023-04-03 14:27:06] Storing 9999 Occurrences (9999/10000/74191)
[INFO] [2023-04-03 14:27:09] Storing 10000 Occurrences (19999/20000/74191)
[INFO] [2023-04-03 14:27:12] Storing 10000 Occurrences (29999/30000/74191)
[INFO] [2023-04-03 14:27:14] Storing 10000 Occurrences (39999/40000/74191)
[INFO] [2023-04-03 14:27:20] Storing 10000 Occurrences (49999/50000/74191)
[INFO] [2023-04-03 14:27:23] Storing 10000 Occurrences (59999/60000/74191)
[INFO] [2023-04-03 14:27:26] Storing 10000 Occurrences (69999/70000/74191)
[INFO] [2023-04-03 14:27:29] Storing 4190 Occurrences (74189/74189/74191)
[INFO] [2023-04-03 14:27:30] Handling diff: /app/public/data/adw-birds/diff/adw-birds_measurements_30272.diff (74715 lines)
[INFO] [2023-04-03 14:27:30] Loading measurements diff file into memory (74715 lines)...
[INFO] [2023-04-03 14:27:35] Storing 9999 Traits (19998/10000/74715)
[INFO] [2023-04-03 14:27:38] Storing 9999 MetaTraits (19998/10000/74715)
[INFO] [2023-04-03 14:27:44] Storing 10000 Traits (39998/20000/74715)
[INFO] [2023-04-03 14:27:47] Storing 10000 MetaTraits (39998/20000/74715)
[INFO] [2023-04-03 14:27:53] Storing 10000 Traits (59998/30000/74715)
[INFO] [2023-04-03 14:27:58] Storing 10000 MetaTraits (59998/30000/74715)
[INFO] [2023-04-03 14:28:04] Storing 10000 Traits (79998/40000/74715)
[INFO] [2023-04-03 14:28:08] Storing 10000 MetaTraits (79998/40000/74715)
[INFO] [2023-04-03 14:28:13] Storing 10000 Traits (99998/50000/74715)
[INFO] [2023-04-03 14:28:18] Storing 10000 MetaTraits (99998/50000/74715)
[INFO] [2023-04-03 14:28:23] Storing 10000 Traits (119998/60000/74715)
[INFO] [2023-04-03 14:28:27] Storing 10000 MetaTraits (119998/60000/74715)
[INFO] [2023-04-03 14:28:33] Storing 10000 Traits (139998/70000/74715)
[INFO] [2023-04-03 14:28:38] Storing 10000 MetaTraits (139998/70000/74715)
[INFO] [2023-04-03 14:28:41] Storing 4714 Traits (149426/74713/74715)
[INFO] [2023-04-03 14:28:43] Storing 4714 MetaTraits (149426/74713/74715)
[STOP] [2023-04-03 14:28:44] parse_diff_and_store
[START] [2023-04-03 14:28:44] resolve_keys
[2023-04-03 14:28:51] Resolving downloaded urls (this is not actually downloading them yet)
[INFO] [2023-04-03 14:29:00] Occurrences to nodes (through scientific_names)...
[INFO] [2023-04-03 14:29:04] traits to occurrences...
[INFO] [2023-04-03 14:29:07] traits to nodes (through occurrences)...
[INFO] [2023-04-03 14:29:09] Traits to sex term...
[INFO] [2023-04-03 14:29:11] Traits to lifestage term...
[INFO] [2023-04-03 14:29:12] MetaTraits to traits...
[INFO] [2023-04-03 14:29:14] MetaTraits (simple, measurement row refers to parent) to traits...
[INFO] [2023-04-03 14:29:14] Assocs to occurrences...
[INFO] [2023-04-03 14:29:14] Assocs to nodes...
[INFO] [2023-04-03 14:29:14] Assoc to sex term...
[INFO] [2023-04-03 14:29:14] Assoc to lifestage term...
[INFO] [2023-04-03 14:29:14] MetaAssoc to assocs...
[STOP] [2023-04-03 14:29:14] resolve_keys
[START] [2023-04-03 14:29:14] hold_for_later_1
[STOP] [2023-04-03 14:29:14] hold_for_later_1
[START] [2023-04-03 14:29:14] hold_for_later_2
[STOP] [2023-04-03 14:29:14] hold_for_later_2
[START] [2023-04-03 14:29:14] resolve_missing_parents
[STOP] [2023-04-03 14:29:15] resolve_missing_parents
[START] [2023-04-03 14:29:15] rebuild_nodes
[START] [2023-04-03 14:29:15] Flattener#flatten
[START] [2023-04-03 14:29:15] Flattener#study_resource
[START] [2023-04-03 14:29:15] Flattener#build_ancestry
[STOP] [2023-04-03 14:29:16] Flattener#build_ancestry
[INFO] [2023-04-03 14:29:16] 13011 ancestry keys
[START] [2023-04-03 14:29:16] build_node_ancestors
[INFO] [2023-04-03 14:29:16] old ancestors deleted.
[STOP] [2023-04-03 14:29:17] build_node_ancestors
[START] [2023-04-03 14:29:20] Flattener#propagate_ancestor_ids
[STOP] [2023-04-03 14:29:20] Flattener#propagate_ancestor_ids
[STOP] [2023-04-03 14:29:20] Flattener#flatten
[STOP] [2023-04-03 14:29:20] rebuild_nodes
[START] [2023-04-03 14:29:20] resolve_missing_media_owners
[STOP] [2023-04-03 14:29:20] resolve_missing_media_owners
[START] [2023-04-03 14:29:20] sanitize_media_verbatims
[STOP] [2023-04-03 14:29:20] sanitize_media_verbatims
[START] [2023-04-03 14:29:20] queue_downloads
[STOP] [2023-04-03 14:29:20] queue_downloads
[START] [2023-04-03 14:29:20] parse_names
[WARN] [2023-04-03 14:29:21] I see 13011 names which still need to be parsed.
[INFO] [2023-04-03 14:29:21] 0% of media downloaded
[WARN] [2023-04-03 14:29:22] Names to parse: 10000 formatted: 10000 learned: 9836 parsed: 10000
[WARN] [2023-04-03 14:29:32] Names to parse: 3011 formatted: 3011 learned: 2999 parsed: 3011
[STOP] [2023-04-03 14:29:35] parse_names
[START] [2023-04-03 14:29:35] denormalize_canonical_names_to_nodes
[STOP] [2023-04-03 14:29:36] denormalize_canonical_names_to_nodes
[START] [2023-04-03 14:29:36] match_nodes
[START] [2023-04-03 14:29:36] map_all_nodes_to_pages
[INFO] [2023-04-03 14:32:20] 40% of media downloaded
[STOP] [2023-04-03 14:33:18] map_all_nodes_to_pages
[INFO] [2023-04-03 14:33:18] 731 Unmatched nodes (of 13011)! That's too many to output. Full list in /app/public/data/adw-birds/unmatched_nodes.txt ; First 10: Canonical: Acronicta oblinita; Node#134250413; ResourceID: Acronicta_oblinita; Canonical: Acronicta vulpina; Node#134250417; ResourceID: Acronicta_vulpina; Canonical: Cosmia calami; Node#134253483; ResourceID: Cosmia_calami; Canonical: Eupsilia vinulenta; Node#134255075; ResourceID: Eupsilia_vinulenta; Canonical: Himella fidelis; Node#134255994; ResourceID: Himella_fidelis; Canonical: Leucania multilinea; Node#134256913; ResourceID: Leucania_multilinea; Canonical: Leucania pseudargyria; Node#134256914; ResourceID: Leucania_pseudargyria; Canonical: Lithophane grotei; Node#134257052; ResourceID: Lithophane_grotei; Canonical: Renia factiosalis; Node#134260974; ResourceID: Renia_factiosalis; Canonical: Hylephila phyleus; Node#134256166; ResourceID: Hylephila_phyleus
[START] [2023-04-03 14:33:18] update_nodes
[STOP] [2023-04-03 14:33:25] update_nodes
[STOP] [2023-04-03 14:33:25] match_nodes
[START] [2023-04-03 14:33:25] reindex_search
[STOP] [2023-04-03 14:33:38] reindex_search
[START] [2023-04-03 14:33:38] normalize_units
[STOP] [2023-04-03 14:33:38] normalize_units
[START] [2023-04-03 14:33:38] calculate_statistics
[INFO] [2023-04-03 14:34:10] Duplicate page_id count: 0
[STOP] [2023-04-03 14:34:10] calculate_statistics
[START] [2023-04-03 14:34:10] complete_harvest_instance
[START] [2023-04-03 14:34:10] overall_tsv_creation
[INFO] [2023-04-03 14:34:11] Exporting 13011 nodes as TSV in batches of 10000...
[INFO] [2023-04-03 14:34:11] Processing group of 13011 in 2 batches of 10000
[INFO] [2023-04-03 14:34:59] 56598 Traits (unfiltered) and 0 associations...
[INFO] [2023-04-03 14:34:59] Building Traits map for 10000 nodes (this can take a while)...
[INFO] [2023-04-03 14:35:28] 90% of media downloaded
[INFO] [2023-04-03 14:36:00] 100% of media downloaded
[INFO] [2023-04-03 14:36:00] 100% of media downloaded
[INFO] [2023-04-03 14:36:00] 100% of media downloaded
[INFO] [2023-04-03 14:36:00] 100% of media downloaded
[INFO] [2023-04-03 14:36:00] 100% of media downloaded
[ERR] [2023-04-03 14:36:00][hdls] NO additional images were found to download
[ERR] [2023-04-03 14:36:00][hdls] NO additional images were found to download
[INFO] [2023-04-03 14:36:00] 100% of media downloaded
[ERR] [2023-04-03 14:36:00][hdls] NO additional images were found to download
[ERR] [2023-04-03 14:36:00][hdls] NO additional images were found to download
[ERR] [2023-04-03 14:36:00][hdls] NO additional images were found to download
[ERR] [2023-04-03 14:36:00][hdls] NO additional images were found to download
[INFO] [2023-04-03 14:36:00] 100% of media downloaded
[ERR] [2023-04-03 14:36:00][hdls] NO additional images were found to download
[ERR] [2023-04-03 14:36:00][hdls] NO additional images were found to download
[ERR] [2023-04-03 14:36:00][hdls] NO additional images were found to download
[INFO] [2023-04-03 14:36:00] 100% of media downloaded
[ERR] [2023-04-03 14:36:00][hdls] NO additional images were found to download
[ERR] [2023-04-03 14:36:00][hdls] NO additional images were found to download
[INFO] [2023-04-03 14:36:00] 100% of media downloaded
[ERR] [2023-04-03 14:36:00][hdls] NO additional images were found to download
[INFO] [2023-04-03 14:36:09] Mapped 56598 traits (56598 meta) for 10000 nodes.
[INFO] [2023-04-03 14:36:09] Building Associations map (this can take a while)...
[INFO] [2023-04-03 14:36:12] Done. 0 assocs mapped (0 meta).
[INFO] [2023-04-03 14:36:12] Adding 56598 traits...
[INFO] [2023-04-03 14:36:17] 0 metadata added.
[INFO] [2023-04-03 14:36:17] Adding 0 assocs...
[INFO] [2023-04-03 14:36:17] 0 metadata added.
[INFO] [2023-04-03 14:37:22] Processed 10000/13011 nodes
[INFO] [2023-04-03 14:37:31] 18115 Traits (unfiltered) and 0 associations...
[INFO] [2023-04-03 14:37:31] Building Traits map for 3011 nodes (this can take a while)...
[INFO] [2023-04-03 14:37:43] Mapped 18115 traits (18115 meta) for 3011 nodes.
[INFO] [2023-04-03 14:37:43] Building Associations map (this can take a while)...
[INFO] [2023-04-03 14:37:43] Done. 0 assocs mapped (0 meta).
[INFO] [2023-04-03 14:37:43] Adding 18115 traits...
[INFO] [2023-04-03 14:37:45] 0 metadata added.
[INFO] [2023-04-03 14:37:45] Adding 0 assocs...
[INFO] [2023-04-03 14:37:45] 0 metadata added.
[INFO] [2023-04-03 14:38:38] Processed 13011/13011 nodes
[INFO] [2023-04-03 14:38:38] Average Time: 128.735
[INFO] [2023-04-03 14:38:38] Total Time: 4m28s
[STOP] [2023-04-03 14:38:38] overall_tsv_creation
[INFO] [2023-04-03 14:38:38] Done. Check your files:
[INFO] [2023-04-03 14:38:38] (13001 lines) /app/public/data/adw-birds/publish_nodes.tsv
[INFO] [2023-04-03 14:38:38] (35791 lines) /app/public/data/adw-birds/publish_node_ancestors.tsv
[INFO] [2023-04-03 14:38:39] (13011 lines) /app/public/data/adw-birds/publish_scientific_names.tsv
[INFO] [2023-04-03 14:38:39] (11333 lines) /app/public/data/adw-birds/publish_media.tsv
[INFO] [2023-04-03 14:38:39] (8881 lines) /app/public/data/adw-birds/publish_image_info.tsv
[INFO] [2023-04-03 14:38:39] (11333 lines) /app/public/data/adw-birds/publish_attributions.tsv
[INFO] [2023-04-03 14:38:39] (74714 lines) /app/public/data/adw-birds/publish_traits.tsv
[INFO] [2023-04-03 14:38:40] (1 lines) /app/public/data/adw-birds/publish_metadata.tsv
[STOP] [2023-04-03 14:38:40] complete_harvest_instance
[START] [2023-04-03 14:38:40] completed
[STOP] [2023-04-03 14:38:40] completed
[STOP] [2023-04-03 14:38:40] logged process, took 749.99
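The [START] and [STOP] markers in the log above carry enough information to work out how long each stage ran. The following is a minimal sketch of that calculation, not part of the harvester itself: the function name stage_durations and the regular expression are assumptions made for illustration, and the sample lines are copied from the log. It pairs each STOP with the most recent START of the same name and ignores everything else, including STOP lines that have no matching START (such as create_harvest_instance or the final "logged process" line).

```python
import re
from datetime import datetime

# Matches lines such as "[START] [2023-04-03 14:26:16] convert_to_csv".
# This pattern is an assumption based on the log lines shown above.
LINE_RE = re.compile(
    r"^\[(START|STOP)\] \[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\] (.+)$"
)


def stage_durations(lines):
    """Return {stage_name: seconds} for every START/STOP pair found."""
    started = {}
    durations = {}
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # INFO, WARN, CMD, ERR and untagged lines are skipped
        kind, stamp, name = match.groups()
        when = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S")
        if kind == "START":
            started[name] = when
        elif name in started:
            # Only record a duration when a matching START was seen.
            durations[name] = (when - started.pop(name)).total_seconds()
    return durations


# Sample lines copied from the log above.
sample = [
    "[START] [2023-04-03 14:26:25] parse_diff_and_store",
    "[INFO] [2023-04-03 14:26:25] Loading agents diff file into memory (80 lines)...",
    "[STOP] [2023-04-03 14:28:44] parse_diff_and_store",
    "[START] [2023-04-03 14:29:36] match_nodes",
    "[STOP] [2023-04-03 14:33:25] match_nodes",
]
durations = stage_durations(sample)
print(durations)
```

On these sample lines, parse_diff_and_store comes out at 139 seconds and match_nodes at 229 seconds. Nested stages (for example Flattener#flatten inside rebuild_nodes) are handled independently because each name is tracked separately.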
Latest Process