diff --git a/README.md b/README.md
index f77f3a47de3f89cde6e7fc016a073ee57ca11d2d..2abee45cb3c65f43f0808827e52632ff876d3f5a 100644
--- a/README.md
+++ b/README.md
@@ -1,3 +1,23 @@
 # Programming for Data Science | UFCFVQ-15-M (S1-22)
 
-## ID : 23003188 | Name : Wassem Adel Alaa Iddin
\ No newline at end of file
+## ID : 23003188 | Name : Wassem Adel Alaa Iddin
+
+
+## Available Files
+
+- 2 working templates
+- 4 provided datasets and one created dataset (update.csv)
+
+
+
+# Explanations
+
+
+1. Task 1 (UFCFVQ-15-M Programming Task 1.ipynb)
+
+    The tasks in this section can be solved using mathematical equations and pre-built functions in Python. The code is organized in the task1.dot file. One weakness of the code is that it relies heavily on pre-built functions and does not use libraries such as csv and pandas, which could make the code shorter and more efficient. It could also be improved by reworking the tasks in pure Python and by giving the custom table a more efficient algorithm and better formatting.
+
+
+2. Task 2 (UFCFVQ-15-M Programming Task 2.ipynb)
+
+    The tasks in this section involve basic data cleaning and merging, and can be solved easily using the pandas library. Task 2.10 uses a bar plot to visualize the relationship between the age group and other columns. Task 2.11 uses a heatmap and a regplot to show the correlation between two columns. Task 2.12 uses a p-value to determine the statistical significance of the relationship. One weakness of this section is that the correlation analysis would benefit from additional visualizations. It could also be improved by structuring the code better and adding more graphs to explain the dataset in detail.