Case study
Researching the feasibility of a function that recognizes the intent to drive 30-180 seconds before the driver returns to the vehicle, using data from the user’s personal mobile phone.
Industry: Automotive
Customer: Western Europe OEM
Competence domains:
- Data analysis & AI solutions
- IoT
Challenge
The customer wants to develop a system that detects a user’s intention to drive 30-180 seconds before they enter the car. This makes it possible to precondition the vehicle so that the driver does not have to wait for systems such as the head unit or navigation to restart when they enter the vehicle.
The difficulty was that a GPS signal alone did not solve the problem because of its accuracy issues, specifically when the vehicle was parked in the same building where the user was located, e.g. a garage in a private house. In addition, the system needed to avoid “False Positives”, when a user simply walks by the car without the intention to drive.
Solution
Data collection application development
To solve this problem we developed an iOS application. The app’s functionality included collecting the signals (accelerometer, gyroscope, magnetometer, barometer, location, motion activity, Wi-Fi, Bluetooth, battery consumption, etc.), transferring them to the server side, registration routines and use case check-in (the ability of the driver to describe the environment). Overall, around 60 signals were collected, generating about 650 different features.
Such full-fledged data collection significantly affected battery consumption, so we took a number of optimization steps, such as dynamically reducing the frequency and types of signals collected depending on the context. This reduced battery consumption to an acceptable level.
External data collection
To validate our solution in a real environment we ran data collection programs with external users in 5 different countries in Europe and North America. Users were selected based on their regular parking behavior to cover different use cases: living in a private house or apartment, using underground parking where GPS signals are not available, etc.
Data preparation and data quality control
Once we had built the app-to-server pipeline and were collecting the data we aimed for, we shifted our focus to monitoring data quality. Here we used two different approaches. The first checked the quantity of collected signals, and the second checked that numeric values fell within common-sense ranges. This procedure allowed us to react immediately when data quality issues arose with specific users or devices.
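The two checks described above can be sketched roughly as follows. This is a minimal illustration, not the project’s actual code: the signal names, the limits in RANGES and the MIN_SAMPLES threshold are all assumptions chosen for the example.

```python
# Hypothetical common-sense limits per signal: (min, max). Illustrative values only.
RANGES = {
    "pressure_hpa": (300.0, 1100.0),
    "battery_level": (0.0, 1.0),
    "accel_g": (-16.0, 16.0),
}

# Hypothetical minimum number of samples expected per signal in one upload.
MIN_SAMPLES = 3


def check_upload(samples):
    """Return a list of data quality issues for one upload.

    samples maps a signal name to the list of numeric values received.
    """
    issues = []
    for signal, (low, high) in RANGES.items():
        values = samples.get(signal, [])
        # Check 1: quantity of collected signals.
        if len(values) < MIN_SAMPLES:
            issues.append(f"{signal}: only {len(values)} samples")
        # Check 2: values must fall within the common-sense range.
        out_of_range = [v for v in values if not low <= v <= high]
        if out_of_range:
            issues.append(
                f"{signal}: {len(out_of_range)} values outside [{low}, {high}]"
            )
    return issues
```

Running such a check per user and per device on every upload is one way to surface a failing sensor or a misbehaving app build quickly.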
Data pre-processing
Activity information is difficult to extract from the raw signals. This is due to the noise (measurement error) in each signal and to the complexity of the activity itself. Walking, for example, cannot be identified from acceleration alone, because it would be indistinguishable from going up or down stairs. That is why a detailed look at the individual axes of the accelerometer, gyroscope and other sensors is needed to identify the activity.
To speed up the learning process we also applied clustering to group the parking instances and built a dimensionality reduction system for each cluster. This allowed us not only to reduce the overall number of generated features but also to stop collecting unnecessary signals, which has a positive effect on the app’s battery consumption. For example, the barometer signal is important if the user is in a multi-story building but provides no additional information for a private house.
Learning
The solution was based on a combination of different tree-based models (Random Forest, LGBM) and deep learning (CNNs). Each approach has a different training time and sensitivity to data quality, and the combination provided the best results.
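The case study does not say how the individual models were combined. One common option is soft voting, where the probabilities predicted by each model are averaged. The sketch below assumes that scheme purely for illustration; the model names, weights and threshold are hypothetical.

```python
def soft_vote(probabilities, weights=None):
    """Weighted average of per-model probabilities of 'intent to drive'.

    probabilities maps a model name to its predicted probability.
    weights optionally maps the same names to non-negative weights.
    """
    names = list(probabilities)
    if weights is None:
        weights = {name: 1.0 for name in names}
    total = sum(weights[name] for name in names)
    return sum(probabilities[name] * weights[name] for name in names) / total


def predict_intent(probabilities, threshold=0.5, weights=None):
    """Return True when the combined probability reaches the threshold."""
    return soft_vote(probabilities, weights) >= threshold
```

Raising the threshold trades recall for precision, which may matter when a false positive means preconditioning the vehicle for someone who is only walking past it.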
A significant improvement in the results was achieved through careful parameter selection and a meticulous approach to feature extraction. In particular, in addition to different cleaning and normalization approaches, we applied the Fourier transform to the initial time series signals in order to capture the cyclicity of the data.
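As an illustration of how a Fourier transform can expose cyclicity, the sketch below computes a plain discrete Fourier transform of one sensor axis and returns its dominant frequency, for example the step rate while walking. It is a minimal example under assumed inputs (a fixed sample rate and a short window), not the feature pipeline used in the project; a production system would normally use an FFT library.

```python
import cmath
import math


def dft_magnitudes(signal):
    """Normalized DFT magnitudes for frequency bins 0..N//2."""
    n = len(signal)
    magnitudes = []
    for k in range(n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * t / n)
            for t, x in enumerate(signal)
        )
        magnitudes.append(abs(coeff) / n)
    return magnitudes


def dominant_frequency(signal, sample_rate_hz):
    """Frequency in Hz of the strongest non-constant component."""
    # Subtract the mean so a constant offset (e.g. gravity) is ignored.
    mean = sum(signal) / len(signal)
    centered = [x - mean for x in signal]
    magnitudes = dft_magnitudes(centered)
    # Skip bin 0 (the constant component) when searching for the peak.
    peak_bin = max(range(1, len(magnitudes)), key=magnitudes.__getitem__)
    return peak_bin * sample_rate_hz / len(signal)
```

The dominant frequency and the magnitudes of the first few bins can then be added to the feature set alongside the time-domain statistics.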
Validation
To measure the quality of the result we built a set of metrics that provides a detailed analysis of different measurements (precision, recall, learning curve, etc.) for different groups of users and different types of activity-location zones.
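A per-group breakdown of precision and recall could look like the following sketch. The record layout (group, actual, predicted) and the group names used in the usage example are assumptions made for illustration.

```python
from collections import defaultdict


def precision_recall_by_group(records):
    """Compute precision and recall separately for each group.

    records is an iterable of (group, actual, predicted) tuples, where
    actual and predicted are booleans for 'the user intended to drive'.
    A metric is None when its denominator is zero.
    """
    counts = defaultdict(lambda: {"tp": 0, "fp": 0, "fn": 0})
    for group, actual, predicted in records:
        c = counts[group]  # registers the group even for true negatives
        if predicted and actual:
            c["tp"] += 1
        elif predicted and not actual:
            c["fp"] += 1
        elif actual and not predicted:
            c["fn"] += 1

    result = {}
    for group, c in counts.items():
        predicted_positive = c["tp"] + c["fp"]
        actual_positive = c["tp"] + c["fn"]
        result[group] = {
            "precision": c["tp"] / predicted_positive if predicted_positive else None,
            "recall": c["tp"] / actual_positive if actual_positive else None,
        }
    return result
```

For example, grouping by a hypothetical parking type such as "underground" or "private_house" shows whether false positives concentrate in one kind of location.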