How analyzing Airbnb data made me excited to visit Málaga

Mahmoud Ahmed
5 min read · Apr 3, 2020

Introduction

This article started as a project for Udacity’s Data Scientist Nanodegree Program, which requires an analysis of a real dataset such as Airbnb data. Airbnb is a well-known platform for arranging and offering stays, and it is used all over the world. I chose Spain as my starting country, and when I found that data for the city of Málaga was available, I picked it right away: I had heard a lot about the city and wanted to know more about accommodation there. To be honest, the results surprised me. In this article I’ll present some of these insights.

These are some of the questions I always ask myself when planning a new trip, applied here to Málaga:

  • What is the perfect time of year to visit Málaga?
  • Which neighborhood is more affordable and less crowded to accommodate?
  • Which features are more involved in predicting hosts’ price?

As a data scientist, I also wanted to use data science to answer the third question, in particular by trying to predict property prices in Málaga. Here are the results.

What is the perfect time of year to visit Málaga?

This question can be split into two parts: time and price. For the time part, I calculated the average daily availability for each month. I found that Málaga is fairly active most of the year, but most of all in May, July, and August. January, February, and March, on the other hand, are less active.

Fig [1] availability percentage per month in Málaga city
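As a rough illustration, this is one way the monthly availability figure could be computed with pandas. It is a sketch, not the code from the project: the column names (`listing_id`, `date`, `available`) and the "t"/"f" flags are assumptions based on the usual Airbnb calendar export, and the sample rows are made up.

```python
import pandas as pd


def monthly_availability(calendar: pd.DataFrame) -> pd.Series:
    """Percentage of listing-days marked available, per calendar month."""
    df = calendar.copy()
    df["date"] = pd.to_datetime(df["date"])
    # Assumed encoding: "t" means the night is available, "f" means it is not.
    df["is_available"] = df["available"].eq("t")
    # The mean of a boolean column is the share of True values.
    return df.groupby(df["date"].dt.month)["is_available"].mean().mul(100)


# Tiny made-up sample, only to show the expected input shape.
sample = pd.DataFrame({
    "listing_id": [1, 1, 2, 2],
    "date": ["2020-01-05", "2020-05-10", "2020-01-05", "2020-05-10"],
    "available": ["t", "f", "t", "t"],
})
availability = monthly_availability(sample)
print(availability)
```

A low availability percentage suggests a busy month, since nights that are not available are either booked or blocked by the host. Grouping by `df["date"].dt.day` instead of the month would give the per-day view in the same way.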

Then I looked at which days of the month are busiest, and found that most people visit Málaga at the end of each month, in the active and the inactive months alike.

Fig [2] availability percentage per active month day in Málaga city
Fig [3] availability percentage per inactive month day in Málaga city

For the price part, I calculated the average price for each month and found that prices increase in winter (December, January, and February) and decrease in spring (March, April, and May).

Fig [4] Prices per month in Málaga city
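The monthly price average can be sketched in a similar way. The one extra step is that prices in this kind of export are often stored as text such as "$1,200.00", so they need cleaning before averaging. The column names (`date`, `price`) and that text format are assumptions, and the numbers below are invented for the example.

```python
import pandas as pd


def monthly_average_price(calendar: pd.DataFrame) -> pd.Series:
    """Average nightly price per calendar month."""
    df = calendar.copy()
    df["date"] = pd.to_datetime(df["date"])
    # Strip the currency symbol and thousands separators: "$1,200.00" -> 1200.0
    df["price"] = (
        df["price"].astype(str).str.replace(r"[$,]", "", regex=True).astype(float)
    )
    return df.groupby(df["date"].dt.month)["price"].mean()


# Made-up rows, only to show the cleaning and the grouping.
sample = pd.DataFrame({
    "date": ["2020-01-01", "2020-01-02", "2020-03-01"],
    "price": ["$100.00", "$1,200.00", "$80.00"],
})
prices = monthly_average_price(sample)
print(prices)
```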

So I can say it clearly:

“March is the least crowded and most affordable time of year to visit Málaga”

Which neighborhood is more affordable and less crowded?

No one wants to spend a trip in a crowded neighborhood or pay a year’s savings for a few days in a hotel, so here I explored Málaga’s neighborhoods in terms of price and crowding.

To determine which neighborhoods are expensive, I calculated the average price for each neighborhood group. Churriana, Centro, and Este are the most expensive, while Cruz De Humilladero, Ciudad Jardin, and Bailen-Miraflores are the cheapest.

Fig [5] Average price per neighborhood

To measure crowding, I calculated the average number of listings (accommodations) in each neighborhood and found that Campanillas, Cruz De Humilladero, and Ciudad Jardin are the most crowded, while Este, Puerto de la Torre, and Teatinos-Universidad are the least crowded.

Fig [6] Average number of listings per neighborhood
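Both neighborhood views can be produced from a single grouping. The sketch below is only one possible reading of the comparison: it uses a plain listing count as a stand-in for crowding, which may differ from the exact measure behind the figure. The column names (`id`, `neighbourhood`, `price`) are hypothetical, and the neighborhoods "A" and "B" and their prices are invented.

```python
import pandas as pd


def neighbourhood_summary(listings: pd.DataFrame) -> pd.DataFrame:
    """Average price and number of listings per neighbourhood, cheapest first."""
    summary = listings.groupby("neighbourhood").agg(
        avg_price=("price", "mean"),   # affordability view
        n_listings=("id", "count"),    # crowding proxy (an assumption)
    )
    return summary.sort_values("avg_price")


# Invented listings for two placeholder neighbourhoods.
sample = pd.DataFrame({
    "id": [1, 2, 3, 4],
    "neighbourhood": ["A", "A", "B", "A"],
    "price": [90.0, 110.0, 120.0, 100.0],
})
summary = neighbourhood_summary(sample)
print(summary)
```

Sorting the same table by `n_listings` instead of `avg_price` gives the crowding ranking.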

So Ciudad Jardin and Cruz De Humilladero are cheap but crowded, and no single neighborhood is both the cheapest and the least crowded. Still, it is easy to find well-priced hosts with an acceptable level of crowding, and paying a lot of money is not the only way to find a calm place.

Which features are more involved in predicting price?

I used data science to build a regression model that takes some property features and predicts an estimated price, since everyone wants a price that matches what the property offers. It is also worth asking which factors make it possible to say “this property is suitable”.

After training a random forest regression model to predict the price, I found that these features matter most to the model:

  • Property type.
  • Count of listings or reservations for the property.
  • Count of available days listed.
  • Whether the host is verified.
  • Whether the listing had reservations in the last week.
  • Count of reviews on the property.
  • Count of people to be accommodated.
  • Availability to have extra people.
  • The cost of cleaning service.
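For readers who want to see how such a ranking is typically obtained, here is a minimal sketch with scikit-learn’s `RandomForestRegressor` and its `feature_importances_` attribute. It is not the model from the project: the data is synthetic, the feature names (`accommodates`, `cleaning_fee`, `noise`) are placeholders, and the hyperparameters are arbitrary.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor


def rank_features(X: pd.DataFrame, y: pd.Series, seed: int = 0) -> pd.Series:
    """Fit a random forest and return feature importances, largest first."""
    model = RandomForestRegressor(n_estimators=100, random_state=seed)
    model.fit(X, y)
    importances = pd.Series(model.feature_importances_, index=X.columns)
    return importances.sort_values(ascending=False)


# Synthetic data: the price depends strongly on capacity, weakly on the
# cleaning fee, and not at all on the "noise" column.
rng = np.random.default_rng(0)
X = pd.DataFrame({
    "accommodates": rng.integers(1, 9, size=300),
    "cleaning_fee": rng.uniform(0, 60, size=300),
    "noise": rng.normal(size=300),
})
y = 20 * X["accommodates"] + 0.5 * X["cleaning_fee"] + rng.normal(size=300)

importances = rank_features(X, y)
print(importances)
```

These impurity-based importances add up to 1 and only say how much the model relied on each feature, not whether a feature pushes the price up or down.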

Conclusion

In this article, we took a look at Málaga Airbnb data to answer some questions that might help both travelers and hosts make better decisions.

The main insights are:

  • May, July, and August are the most active months in the year and January, February, and March are less active.
  • Prices increase in the winter season and decrease in the spring season.
  • Churriana, Centro and Este are the most expensive.
  • Cruz De Humilladero, Ciudad Jardin and Bailen-Miraflores are the cheapest.
  • Campanillas, Cruz De Humilladero, and Ciudad Jardin are the most crowded neighborhoods.
  • Este, Puerto de la Torre, and Teatinos-Universidad are the least crowded neighborhoods.
  • The property type, the host and property history, visitors’ reviews, and the cost of cleaning service are the most important factors in determining a property’s price.

If you’re curious about the details of this analysis and want to dig deeper, don’t hesitate to jump into my GitHub repo.

Finally, thanks to good weather all year and high temperatures in summer, Málaga is becoming the first choice for lovers of beaches and nightlife. Málaga is worth visiting: it is the gateway to Andalucía, home of the old Islamic heritage and Southern Spain’s most popular region.

Mahmoud Ahmed

Data Engineer passionate about solving real-world problems through ML & data engineering techniques. Join me at www.mahmoudahmed.dev