Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Import issues #1197

Open
reedy opened this issue Apr 23, 2023 · 11 comments
Open

Import issues #1197

reedy opened this issue Apr 23, 2023 · 11 comments
Labels
bug import Issues related to data import (e.g. CSV, FlightMemory).

Comments

@reedy
Copy link
Collaborator

reedy commented Apr 23, 2023

On my dev install... I deleted all my flights (using the UI) and then imported a newer backup from openflights.org

It seems to have resulted in some weird plane type attributions, even though the exported CSV looks good

Screenshot 2023-04-23 at 18 49 28

Should this be showing IDs or types?

Screenshot 2023-04-23 at 18 50 44

@reedy reedy added bug import Issues related to data import (e.g. CSV, FlightMemory). labels Apr 23, 2023
@reedy
Copy link
Collaborator Author

reedy commented Apr 23, 2023

I doubt the fix to #1116 is to blame....

@reedy
Copy link
Collaborator Author

reedy commented Apr 23, 2023

But I'm guessing something to do with differing plane ids locally..?

mysql> select * from planes;
+-------------------------+------+-------+------+--------+
| name                    | abbr | speed | plid | public |
+-------------------------+------+-------+------+--------+
| Airbus A320             | NULL |  NULL |    1 | NULL   |
| Boeing 777-300ER        | NULL |  NULL |    2 | NULL   |
| Boeing 737-MAX9         | NULL |  NULL |    3 | NULL   |
| Boeing 737-800          | NULL |  NULL |    4 | NULL   |
| Embraer ERJ175          | NULL |  NULL |    5 | NULL   |
| Airbus A319             | NULL |  NULL |    6 | NULL   |
| Airbus A321NEO          | NULL |  NULL |    7 | NULL   |
| Embraer E175            | NULL |  NULL |    8 | NULL   |
| Airbus A321             | NULL |  NULL |    9 | NULL   |
| Airbus A350-900         | NULL |  NULL |   10 | NULL   |
| Airbus A330-300         | NULL |  NULL |   11 | NULL   |
| Embraer E190            | NULL |  NULL |   12 | NULL   |
| Airbus A320NEO          | NULL |  NULL |   13 | NULL   |
| Embraer 190             | NULL |  NULL |   14 | NULL   |
| Boeing 777-200          | NULL |  NULL |   15 | NULL   |
| Boeing 787-8            | NULL |  NULL |   16 | NULL   |
| BAE Jetstream J41       | NULL |  NULL |   17 | NULL   |
| Airbus A380             | NULL |  NULL |   18 | NULL   |
| Embraer E190SR          | NULL |  NULL |   19 | NULL   |
| Bombardier CRJ-900      | NULL |  NULL |   20 | NULL   |
| Boeing 737-700          | NULL |  NULL |   21 | NULL   |
| Boeing 747-400          | NULL |  NULL |   22 | NULL   |
| Embraer RJ145           | NULL |  NULL |   23 | NULL   |
| Airbus A340-600         | NULL |  NULL |   24 | NULL   |
| Boeing 777-200ER        | NULL |  NULL |   25 | NULL   |
| Bombardier CRJ-1000     | NULL |  NULL |   26 | NULL   |
| Airbus A350-1000        | NULL |  NULL |   27 | NULL   |
| Boeing 767-300          | NULL |  NULL |   28 | NULL   |
| Airbus A330-200         | NULL |  NULL |   29 | NULL   |
| Boeing 757-200          | NULL |  NULL |   30 | NULL   |
| Boeing 787-9            | NULL |  NULL |   31 | NULL   |
| Boeing 777-200LR        | NULL |  NULL |   32 | NULL   |
| Airbus A318             | NULL |  NULL |   33 | NULL   |
| McDonnell Douglas MD-83 | NULL |  NULL |   34 | NULL   |
| Boeing 767-300ER        | NULL |  NULL |   35 | NULL   |
| Embraer E170            | NULL |  NULL |   36 | NULL   |
| Canadair CRJ-700        | NULL |  NULL |   37 | NULL   |
| Canadair CRJ-900        | NULL |  NULL |   38 | NULL   |
| McDonnell Douglas MD83  | NULL |  NULL |   39 | NULL   |
| Boeing 787-800          | NULL |  NULL |   40 | NULL   |
| Bombardier Q400         | NULL |  NULL |   41 | NULL   |
| Boeing 737-400          | NULL |  NULL |   42 | NULL   |
| Boeing 737-300          | NULL |  NULL |   43 | NULL   |
| Airbus A321T            | NULL |  NULL |   44 | NULL   |
| Airbus A320-200         | NULL |  NULL |   45 | NULL   |
| Avro RJ85               | NULL |  NULL |   46 | NULL   |
| Dornier J328            | NULL |  NULL |   47 | NULL   |
| Fokker 100              | NULL |  NULL |   48 | NULL   |
| Airbus A321-200         | NULL |  NULL |   49 | NULL   |
| ATR 72                  | NULL |  NULL |   50 | NULL   |
| Saab 2000               | NULL |  NULL |   51 | NULL   |
| McDonnell Douglas MD80  | NULL |  NULL |   52 | NULL   |
| Canadair CRJ-200        | NULL |  NULL |   53 | NULL   |
| Embraer ERJ-145         | NULL |  NULL |   54 | NULL   |
| Boeing 767-200          | NULL |  NULL |   55 | NULL   |
| McDonnell Douglas MD82  | NULL |  NULL |   56 | NULL   |
| Fokker 70               | NULL |  NULL |   57 | NULL   |
| BAe 146-200             | NULL |  NULL |   58 | NULL   |
| Boeing 767              | NULL |  NULL |   59 | NULL   |
+-------------------------+------+-------+------+--------+
59 rows in set (0.00 sec)

If I truncate the table, then reimport... They all come in empty...

@reedy
Copy link
Collaborator Author

reedy commented Apr 23, 2023

Caused by 2d82511 as the plane id (plid) is actually being exported?

@chrisrosset
Copy link
Collaborator

chrisrosset commented Jun 22, 2023

But I'm guessing something to do with differing plane ids locally..?

mysql> select * from planes;
+-------------------------+------+-------+------+--------+
| name                    | abbr | speed | plid | public |
+-------------------------+------+-------+------+--------+
| Airbus A320             | NULL |  NULL |    1 | NULL   |

snip

| Boeing 767              | NULL |  NULL |   59 | NULL   |
+-------------------------+------+-------+------+--------+
59 rows in set (0.00 sec)

I notice that planes.dat does not contain a plid column. With the live prod database having been updated multiple times, I don't think we can easily replicate the IDs ourselves.

@chrisrosset
Copy link
Collaborator

Caused by 2d82511 as the plane id (plid) is actually being exported?

@reedy, I've looked into this today and I don't think this is a bug.

The export/backup functionality will export the database IDs for multiple entities:

Data CSV column name
Origin Airport From_OID
Destination Airport To_OID
Airline Airline_OID
Plane Type Plane_OID

When present in the upload payload, this allows the import code to easily match entities even if they were renamed or otherwise updated between the backup creation and upload. Importantly, during parsing, these take priority over the string columns (e.g. airline, plane) 1.

airports.dat and airlines.dat both contain an ID column which load-data.sql uses when seeding the database with data. This makes the IDs consistent between the real openflights.org and dev setups.

However, planes.dat does not have an ID column. As a result, your local instance with either fail to match planes or match them incorrectly.

The solution is to wipe the Plane_OID column (best to wipe all the OID columns) in your CSV before doing an import on the local instance. I've tested this with my dev setup.

The data files have not been updated in a while (7 years for airlines.dat, 4 years for both planes.dat and airports.dat) so some entries are marked as "No matches, will be added as new" (e.g. Wizz Air UK).

Footnotes

  1. For example, https://github.com/jpatokal/openflights/blob/2f86a25d09c0721a2b37aa464bd7c474244fdeab/php/import.php#L497-L505 uses the Airline ID if present and otherwise falls back to string matching.

@2sylbl
Copy link

2sylbl commented Jun 24, 2023

See my comment on 992 - #992 (comment)

Users are allowed to enter their own plane designations so I’m curious as to how this might play out here as well. They may or may not confirm to existing ids, even if there were an Id column.

@chrisrosset
Copy link
Collaborator

It doesn't change the situation much here. I only mentioned the OIDs matching as an explanation for why this only happened to planes. You should wipe all the OIDs when moving backups across database instances.

For a local install, you have to clear that column (and the other OID columns as well). The import process will automatically create new entities for you if it can't match them based on the name.

@reedy
Copy link
Collaborator Author

reedy commented Jul 11, 2023

I note after deleting the databases, the updated imports etc.. Then importing my latest backup from openflights.org, the planes are wrong or mostly missing

@chrisrosset
Copy link
Collaborator

You need to delete the OIDs from the CSV when importing across instances to get the new instance to create whatever is not present in the database - try it. This only affects people moving data across instances.

@reedy
Copy link
Collaborator Author

reedy commented Jul 11, 2023

Aha, thanks. I wonder if this should almost be a difference between "backup" and "export"...

Probably should be documented somewhere (on the site) too

@chrisrosset
Copy link
Collaborator

Aha, thanks. I wonder if this should almost be a difference between "backup" and "export"...

I think exporting them is correct because the names can (and do, even if infrequently) change while the IDs are stable. This helps if your backup is not recent. I think this is WAD. Let's be honest, either you're using openflights.org or you're running your own instance. I think this is only an issue for us. :)

Probably should be documented somewhere (on the site) too

I'm not sure about the site itself since I don't believe this applies to users of openflights.org. How about the repo docs? You're probably going to read these if you're running your own instance.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug import Issues related to data import (e.g. CSV, FlightMemory).
Projects
None yet
Development

No branches or pull requests

3 participants