For this site to be useful, there needs to be a way to look at cars of a particular engine and trim, since these data affect fuel economy.
These data are very poor right now on Fuelly. For instance, for an Acura Integra, there are 4 submodels:
However, all the cars are in the 'Other' category, which is useless. 'Other' shouldn't be an option.
Continuing with the Acura Integra, the engine category has the following options:
1.6L GAS L4 1590cc (1)
1.7L GAS L4 1678cc (1)
1.8L GAS L4 1797cc (3)
1.8L GAS L4 1834cc (1)
GAS L4 (275)
The problems here are that 'GAS' and 'GAS L4' shouldn't be options. Furthermore, the distinction that matters with the Integra is whether it has VTEC (Honda's variable valve timing), and this isn't listed next to the displacement. For other types of cars, 'turbo' needs to be listed next to the displacement.
My conclusion is that after you scrape these data from the net, you really need someone who knows cars to go through them by hand. There are just too many non-standard ways in which cars are named, marketed, and categorized. I could probably do it in a month, so I think the undertaking is worth your time.
On the previous Fuelly, there was nowhere for users to break their car down to the submodel. So what you see as "Other" is really just the data that we had before. The 7 cars that you see set to a submodel are 7 cars that the owners have taken upon themselves to update since we updated to the new system. "Other" is the holding area for cars that have not been updated by the owners to be more specific down to the submodel.
We are still considering methods to get more of the vehicles classified into submodels.
At present, we need the owners of all cars to update their car to the correct submodel. We cannot guess at the submodel of a car. There are over 200,000 cars in the system. We could manually move cars if the owner has placed the submodel as part of the car's title, but that is a Herculean task to do by hand.
The same applies to the engines. What you see as "GAS" and "GAS L4" are the engines that owners had specified previously. We cannot assign them to the 1.6/1.7/1.8 options as we do not know which they actually have.
None of our data is being scraped from the internet. We are using an industry-standard database with monthly updates. We are trying to shoehorn the Wild West naming style of the past Fuelly into this new system. The database is incomplete, so we are having to fit the missing makes/models/submodels into the system and consider what to do with makes/models/submodels which don't seem to match what the owners believe they should, such as the Mini Coopers.
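As a rough illustration of the title-based approach mentioned above, a script could propose a submodel only when a car's title names exactly one known submodel, and leave everything else in "Other" for a person to review. This is a minimal sketch in Python, not Fuelly's actual code: the function name suggest_submodel and the trim list are assumptions made for the example, and the trims are illustrative rather than the list from the database.

```python
import re

# Illustrative trim names only (assumption); the real list would come from
# the vehicle database for the car's make, model, and year.
EXAMPLE_SUBMODELS = ["RS", "LS", "GS", "GS-R", "Type R"]


def suggest_submodel(title, known_submodels):
    """Return the one known submodel named in a free-text title, else None."""
    matches = []
    for name in known_submodels:
        # Match the name as a whole token, ignoring case, so "LS" does not
        # match inside "wheels" and "GS" does not match inside "GS-R".
        pattern = r"(?<![A-Za-z0-9])" + re.escape(name) + r"(?![A-Za-z0-9-])"
        if re.search(pattern, title, flags=re.IGNORECASE):
            matches.append(name)
    # Suggest only when the title is unambiguous; zero or several matches
    # are left for the owner or an administrator to resolve.
    return matches[0] if len(matches) == 1 else None
```

A title such as "98 Integra GS-R" would yield a suggestion, while "My daily driver" or "Integra LS or RS?" would yield none, so the car would stay in "Other" until someone decides.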
It is interesting to hear the details of your database.
I wasn't suggesting that you sort 200,000 cars by hand. However, the submodels you have listed currently as options are not correct. I don't think you can depend on your users to input the submodel from scratch, but you can give them a few options to select from.
Figuring out the 'correct' submodels for your users to select from is what I suggest doing by hand. I think the number of submodels for all the cars made in the last hundred years with production runs greater than 1000 might be only 1000-5000, which can, and in my opinion should, be done by hand.
Perhaps we are crossing wires here. The submodels that you see on the browse cars page are only those submodels that have been chosen by drivers from our database. If there are other submodels available but haven't been chosen, then they won't appear here. You can pick a particular year and then go through the steps to add that vehicle to see what submodels are available, which could very well be more than you see on the browse pages.
There are cases where the database is missing a submodel that it should have for a particular year, but it does cover most.
"The submodels that you see on the browse cars page are only those submodels that have been chosen by drivers from our database. If there are other submodels available but haven't been chosen, then they won't appear here."
I think I understand this to be exactly the problem. The users don't really know how to describe their submodel, so they enter submodels that don't make sense. This then complicates your data.
A good example is the 'Mini Cooper S with the Chili Pack'. It's just a 'Mini Cooper S'. The Chili Pack doesn't really matter for mpg, so the user should not be allowed to enter Chili Pack as part of their submodel description/name.
They should look up their year and see only a finite number of options, including the base Cooper, Cooper S, and John Cooper Works. Then if they have the Cooper S with Chili Pack, they would just choose Cooper S, since it's the closest to what they think they have.
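To make the idea concrete, here is a minimal sketch in Python of a closed list of options per year. It is only an illustration under my own assumptions: the CATALOG table, the year 2009, and the function names are made up for the example and say nothing about how Fuelly actually stores its data.

```python
# Hypothetical catalog (assumption): (make, model, year) -> allowed submodels.
# The year is a placeholder; the submodel names are the ones named above.
CATALOG = {
    ("Mini", "Cooper", 2009): ["Cooper", "Cooper S", "John Cooper Works"],
}


def submodel_options(make, model, year):
    """Return the fixed list of submodels a user may choose from."""
    return list(CATALOG.get((make, model, year), []))


def set_submodel(car, choice):
    """Store the choice only if it is one of the curated options."""
    options = submodel_options(car["make"], car["model"], car["year"])
    if choice not in options:
        # Free text such as an option package is rejected, not stored.
        raise ValueError(f"{choice!r} is not one of {options}")
    car["submodel"] = choice
    return car
```

With this shape, an owner of a Cooper S with the Chili Pack can only pick "Cooper S"; trying to save "Cooper S with Chili Pack" raises an error instead of creating a new submodel.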
That's the part that should be done by hand by Fuelly: entering the model and submodel. If you leave the submodel description to the users, you'll only get a mess. For rare cars, you'll have to double-check information from users' requests to add models, but still, there aren't THAT many types of cars...
Submodels that users enter themselves do not display on the browse vehicle pages.
Only the submodels that are in the database, either by default or because we, as administrators, added them, and that are then chosen by a user will display on the browse vehicle pages. We do not enter any submodels until we research their validity.
The "Chili Pack" submodel wasn't created by any user of Fuelly. The Mini Cooper submodels are exactly how they exist in the database as purchased. By and large, the submodels in the database do appear appropriate. The Mini Cooper is one of the few anomalies.