Not so long ago, Berlin was an affordable place to live. When I moved to Berlin, my uncle was paying around 580 euros for a one-bedroom apartment in the Mitte neighborhood. From there, you could walk through history, art, and culture without even taking a bus. Things have changed since then. Berliners now not only compete with thousands of other applicants to find a place, but also pay hefty rents to secure what they want.
I wanted to investigate this matter and answer some questions. More importantly, I decided to build a machine learning model that predicts rent prices in Berlin from different features. Since few public datasets on Berlin housing prices are available, I used a Kaggle dataset. It contains apartment rental offers in Germany, collected from immobilienscout24 on 2018-09-22, 2019-05-10, and 2019-10-08. I kept only the Berlin listings and worked on them exclusively.
The Berlin dataset is available on my GitHub.
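A minimal sketch of the filtering step, assuming the state name is stored in the `regio1` column as the string "Berlin". The helper name `filter_berlin`, the file name in the comment, and the tiny sample frame are illustrative assumptions, not the project's actual code or data.

```python
import pandas as pd


def filter_berlin(listings: pd.DataFrame, state_column: str = "regio1") -> pd.DataFrame:
    """Return only the rows whose federal state equals 'Berlin'.

    Assumes the state is stored as the exact string 'Berlin';
    rows with a missing state are dropped by the comparison.
    """
    is_berlin = listings[state_column] == "Berlin"
    return listings.loc[is_berlin].reset_index(drop=True)


# With the real data this would start from the Kaggle CSV, for example:
# listings = pd.read_csv("immo_data.csv")  # file name is an assumption

# Made-up rows standing in for the full dataset.
sample = pd.DataFrame(
    {
        "regio1": ["Berlin", "Bayern", "Berlin", "Sachsen"],
        "baseRent": [850.0, 1200.0, 640.0, 410.0],
    }
)

berlin = filter_berlin(sample)
print(berlin)
```

Resetting the index keeps later positional lookups simple. Drop that call if the original row numbers are needed to trace listings back to the full dataset.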
- Gathered Berlin listings from immobilienscout24, then validated and cleaned the data
- Performed an exploratory data analysis (plus AutoEDA) and answered different questions about the Berlin real estate market
- Built machine learning models to predict rent prices in Berlin using regression techniques
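The regression step could look roughly like the sketch below. This overview does not say which regression techniques were used, so plain linear regression from scikit-learn stands in here. The feature choice (`livingSpace`, `noRooms`) and the numbers are synthetic and purely illustrative: the rents are generated from a fixed formula, not taken from the dataset.

```python
import pandas as pd
from sklearn.linear_model import LinearRegression

# Synthetic listings: totalRent = 12 * livingSpace + 50 * noRooms + 100.
# These values are invented for illustration and are NOT real Berlin rents.
train = pd.DataFrame(
    {
        "livingSpace": [30.0, 45.0, 60.0, 80.0, 100.0, 120.0],
        "noRooms": [1.0, 2.0, 2.0, 3.0, 3.0, 4.0],
    }
)
train["totalRent"] = 12 * train["livingSpace"] + 50 * train["noRooms"] + 100

features = ["livingSpace", "noRooms"]

# Fit an ordinary least squares model on the two numeric features.
model = LinearRegression()
model.fit(train[features], train["totalRent"])

# Predict the rent of a hypothetical 70 sqm, 2-room flat.
new_flat = pd.DataFrame({"livingSpace": [70.0], "noRooms": [2.0]})
predicted_rent = float(model.predict(new_flat)[0])
print(round(predicted_rent, 2))
```

On real listings the fit would not be exact. A held-out test split and an error metric such as mean absolute error would be needed to judge the model.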
- pandas
- numpy
- matplotlib
- seaborn
- scikit-learn
- missingno
AutoEDA by:
- pandas_profiling
- AutoViz
The variables of this dataset are:
- regio1: Bundesland (state)
- serviceCharge: ancillary costs such as electricity or internet, in €
- heatingType: type of heating
- telekomTvOffer: whether pay TV is included and, if so, which offer
- telekomHybridUploadSpeed: hybrid internet upload speed
- newlyConst: whether the building is newly constructed
- balcony: whether the listing has a balcony
- picturecount: number of pictures uploaded to the listing
- pricetrend: price trend as calculated by Immoscout
- telekomUploadSpeed: internet upload speed
- totalRent: total rent (usually the sum of base rent, service charge, and heating cost)
- yearConstructed: construction year
- scoutId: Immoscout ID
- noParkSpaces: number of parking spaces
- firingTypes: main energy sources, separated by colon
- hasKitchen: whether the flat has a kitchen
- geo_bln: Bundesland (state), same as regio1
- cellar: whether the flat has a cellar
- yearConstructedRange: binned construction year, 1 to 9
- baseRent: base rent without electricity and heating
- houseNumber: house number
- livingSpace: living space in sqm
- geo_krs: district, above ZIP code
- condition: condition of the flat
- interiorQual: interior quality
- petsAllowed: whether pets are allowed; can be yes, no, or negotiable
- street: street name
- streetPlain: street name (plain, different formatting)
- lift: whether an elevator is available
- baseRentRange: binned base rent, 1 to 9
- typeOfFlat: type of flat
- geo_plz: ZIP code
- noRooms: number of rooms
- thermalChar: energy need in kWh/(m^2a), defines the energy efficiency class
- floor: floor the flat is on
- numberOfFloors: number of floors in the building
- noRoomsRange: binned number of rooms, 1 to 5
- garden: whether the flat has a garden
- livingSpaceRange: binned living space, 1 to 7
- regio2: district or Kreis, same as geo_krs
- regio3: city/town
- description: free-text description of the listing
- facilities: free-text description of the available facilities
- heatingCosts: monthly heating costs in €
- energyEfficiencyClass: energy efficiency class (based on binned thermalChar, deprecated since Feb 2020)
- lastRefurbish: year of last renovation
- electricityBasePrice: monthly base price for electricity in € (deprecated since Feb 2020)
- electricityKwhPrice: electricity price per kWh (deprecated since Feb 2020)
- date: time of scraping
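Since totalRent is described as usually being the sum of base rent, service charge, and heating cost, a missing totalRent can be estimated from its components. The sketch below shows one way to do that. Treating a missing component as zero is an assumption made here for illustration, and the helper name `rent_component_sum`, the `totalRentFilled` column, and the sample values are all hypothetical.

```python
import pandas as pd

RENT_PARTS = ["baseRent", "serviceCharge", "heatingCosts"]


def rent_component_sum(listings: pd.DataFrame) -> pd.Series:
    """Sum base rent, service charge, and heating costs per listing.

    Missing components are treated as 0, which is an assumption:
    it underestimates the rent when a cost exists but was not reported.
    """
    return listings[RENT_PARTS].fillna(0).sum(axis=1)


# Made-up rows, not taken from the dataset.
sample = pd.DataFrame(
    {
        "baseRent": [800.0, 650.0, 1000.0],
        "serviceCharge": [120.0, 90.0, None],
        "heatingCosts": [60.0, None, 80.0],
        "totalRent": [980.0, 740.0, None],
    }
)

estimated = rent_component_sum(sample)

# Keep the reported totalRent where present, fall back to the estimate otherwise.
sample["totalRentFilled"] = sample["totalRent"].fillna(estimated)
print(sample[["totalRent", "totalRentFilled"]])
```

The same component sum can also serve as a sanity check: listings whose reported totalRent differs strongly from it are candidates for manual inspection during cleaning.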
