Over Christmas 2023, to clear my mind, I decided to take a break from Python and R and try something new: I completed the Introduction to Julia course on DataCamp.
The course is really good and easy to follow for an absolute beginner like I was, but all the exercises run within the perfect environment of the course provider. It is later, when you try on your own PC, that the problems arise.
First you have to install Julia on your computer (instructions are on the Julia website). After that, you can install any package you need by pressing the ] key in the Julia REPL and then typing add followed by the package name. For example, if you want to write your Julia code in Jupyter notebooks, you will need to install the IJulia package.
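If you prefer to script the installation instead of typing at the `pkg>` prompt, the same step can be sketched with the `Pkg` standard library. The package list below is my assumption of what this post needs (IJulia plus the three packages used in the notebook); adjust it to your own setup.

```julia
using Pkg

# Same effect as typing `add IJulia DataFrames CSV HTTP` at the pkg> prompt.
Pkg.add(["IJulia", "DataFrames", "CSV", "HTTP"])

# Launch Jupyter from within Julia.
# If no Jupyter installation is found, IJulia may offer to install one for you.
using IJulia
notebook()
```

Both steps need an internet connection the first time, and the first `using IJulia` can take a while because the package is precompiled.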
For me, with all the data analysis languages and tools, the trickiest part is loading the data into the workspace. So, after finding a way to do it, I saved the code here for future reference. Feel free to copy and use it.
Julia lessons for Inma in Jupyter
- How to retrieve a .csv data file from the web.
Imagine you want to analyse this dataset from the Public Health Scotland website: https://www.opendata.nhs.scot/dataset/c4db1692-fa02-4a1c-af4c-6039c74633ea/resource/29452b1f-a7be-4e93-9e22-dfa120c2df26 (it contains alcohol-related hospital statistics by sex and age group).
For a reproducible analysis, the best way is to retrieve the data directly from the web without downloading the file to your computer (although the data owner could update the data, in which case subsequent analyses would differ slightly).
import DataFrames, CSV, HTTP
# read csv file
path = ("https://www.opendata.nhs.scot/dataset/c4db1692-fa02-4a1c-af4c-6039c74633ea/resource/29452b1f-a7be-4e93-9e22-dfa120c2df26/download/arhs_agegender_28_02_2023.csv")
data = HTTP.get(path).body
csv_file = CSV.File(data)
df = DataFrames.DataFrame(csv_file)
# Select the columns we are interested in
alcohol_stays = DataFrames.select(df, :Condition, :FinancialYear, :Gender, :AgeGroup, :NumberOfStays)
# select data from 2021/22
alcohol_stays_2021 = DataFrames.filter(row -> row.FinancialYear == "2021/22", alcohol_stays)
# print the first two rows
println(first(alcohol_stays_2021, 2))
2×5 DataFrame
 Row │ Condition               FinancialYear  Gender   AgeGroup  NumberOfStays
     │ String                  String7        String7  String31  Int64
─────┼─────────────────────────────────────────────────────────────────────────
   1 │ All alcohol conditions  2021/22        Male     All               22728
   2 │ All alcohol conditions  2021/22        Female   All               10332
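To practise the select and filter steps without an internet connection, here is a minimal sketch that reads a tiny CSV from memory instead of from the web. The rows are made up purely for illustration (they are not taken from the Public Health Scotland file), and the `toy_` names are mine.

```julia
using CSV, DataFrames

# Made-up rows for illustration only; not values from the real dataset.
toy_csv = """
Condition,FinancialYear,Gender,NumberOfStays
Example condition,2020/21,Male,10
Example condition,2020/21,Female,5
Example condition,2021/22,Male,12
Example condition,2021/22,Female,7
"""

# CSV.File can read from an IO object, so an in-memory buffer
# stands in for the body of the HTTP response.
toy_df = DataFrame(CSV.File(IOBuffer(toy_csv)))

# Keep two columns, then keep only the 2021/22 rows.
toy_stays = select(toy_df, :FinancialYear, :NumberOfStays)
toy_2021 = filter(row -> row.FinancialYear == "2021/22", toy_stays)

println(toy_2021)
```

The filter could also be written as `filter(:FinancialYear => ==("2021/22"), toy_stays)`, which passes only that one column to the comparison instead of the whole row.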
using DataFrames, CSV, HTTP
# read csv file
path = ("https://www.opendata.nhs.scot/dataset/c4db1692-fa02-4a1c-af4c-6039c74633ea/resource/29452b1f-a7be-4e93-9e22-dfa120c2df26/download/arhs_agegender_28_02_2023.csv")
data = HTTP.get(path).body
csv_file = CSV.File(data)
df = DataFrame(csv_file)
# Select the columns we are interested in
alcohol_stays = select(df, :Condition, :FinancialYear, :Gender, :AgeGroup, :NumberOfStays)
println(first(alcohol_stays,10))
# select data from 2019/20
alcohol_stays_2019 = filter(row -> row.FinancialYear == "2019/20", alcohol_stays)
println(first(alcohol_stays_2019,2))
10×5 DataFrame
 Row │ Condition               FinancialYear  Gender   AgeGroup  NumberOfStays
     │ String                  String7        String7  String31  Int64
─────┼─────────────────────────────────────────────────────────────────────────
   1 │ All alcohol conditions  1997/98        Male     All               21462
   2 │ All alcohol conditions  1997/98        Female   All                8232
   3 │ All alcohol conditions  1998/99        Male     All               21930
   4 │ All alcohol conditions  1998/99        Female   All                8637
   5 │ All alcohol conditions  1999/00        Male     All               23637
   6 │ All alcohol conditions  1999/00        Female   All                8955
   7 │ All alcohol conditions  2000/01        Male     All               23337
   8 │ All alcohol conditions  2000/01        Female   All                9051
   9 │ All alcohol conditions  2001/02        Male     All               24474
  10 │ All alcohol conditions  2001/02        Female   All                9846

2×5 DataFrame
 Row │ Condition               FinancialYear  Gender   AgeGroup  NumberOfStays
     │ String                  String7        String7  String31  Int64
─────┼─────────────────────────────────────────────────────────────────────────
   1 │ All alcohol conditions  2019/20        Male     All               25203
   2 │ All alcohol conditions  2019/20        Female   All               11340
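About the caveat mentioned earlier (the data owner may update the file): one possible workaround is to keep a local snapshot and read from it on later runs. The sketch below is only an idea, not part of the original notebook; `load_snapshot` is a helper name I made up, and it relies on the `Downloads` standard library that ships with Julia 1.6 and later.

```julia
using CSV, DataFrames, Downloads

# Hypothetical helper: download `url` to `path` only if that file
# is not already on disk, then read the local copy into a DataFrame.
function load_snapshot(url::AbstractString, path::AbstractString)
    isfile(path) || Downloads.download(url, path)
    return CSV.read(path, DataFrame)
end
```

You would call it with the same URL used above and a file name of your choice, for example `df = load_snapshot(path, "arhs_agegender_snapshot.csv")`. Putting the download date in the file name makes it clear which version of the data an analysis was based on.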
www.inmaruiz.com