From 32ff9fdb4fd6cc09e93f7b21f9c1bb9f6a9f6d27 Mon Sep 17 00:00:00 2001
From: zoonalink <zoonalink@gmail.com>
Date: Wed, 28 Dec 2022 23:56:40 +0000
Subject: [PATCH] Task 2- dev report pasted in

---
 UFCFVQ-15-M_Programming_Task_2.ipynb | 53 ++++++++++++++++++++++++++--
 1 file changed, 51 insertions(+), 2 deletions(-)

diff --git a/UFCFVQ-15-M_Programming_Task_2.ipynb b/UFCFVQ-15-M_Programming_Task_2.ipynb
index d847f7b..edc89ba 100644
--- a/UFCFVQ-15-M_Programming_Task_2.ipynb
+++ b/UFCFVQ-15-M_Programming_Task_2.ipynb
@@ -1773,10 +1773,59 @@
    ]
   },
   {
+   "attachments": {},
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "add markdown text here"
+    "## Introduction\n",
+    "\n",
+    "The purpose of this report is to reflect on my code development process for Task 2 of the `UFCFVQ-15-M Programming_for_Data_Science` coursework.\n",
+    "\n",
+    "## Code Description\n",
+    "\n",
+    "Task_2 requires undertaking a short 'data science' project, making use of Python libraries such as `pandas`, `numpy`, `matplotlib`, `seaborn` and `scipy`.  It involves importing two datasets, merging and cleaning data before analysis, including visualisation and appropriate statistical testing. The project is presented in a Jupyter notebook.  \n",
+    "\n",
+    "## Development Process\n",
+    "\n",
+    "As with Task_1, my development process roughly followed an iterative CRISP-DM approach.  \n",
+    "\n",
+    "Following the import, merge, filter, and clean tasks, I started the analysis.  This was very much in the <i>exploratory</i> spirit of Exploratory Data Analysis (EDA).  In hindsight, I would like to develop a more structured approach to my EDA.\n",
+    "\n",
+    "As I am new to Python, the learning curve was steep: I often knew what I wanted to achieve but not how to do it.  I also needed to spend time refreshing my knowledge of statistical tests.\n",
+    "\n",
+    "## Code Evaluation\n",
+    "\n",
+    "The initial tasks (FR7-9) were straightforward: it was a case of following the instructions.  I did not encounter any problems with these tasks.  That said, I always wonder whether there is a <i>more efficient</i> or <i>standard</i> approach.  \n",
+    "\n",
+    "I relied on the Python libraries I have encountered so far, especially `pandas`.\n",
+    "\n",
+    "FR10-13 were much more challenging (and interesting) because they are not rigid tasks.  Whilst undertaking EDA, I found that I was simultaneously contributing to all four remaining FRs and that I would need to unpick my code for the purposes of the coursework later.\n",
+    "\n",
+    "As a result of significant EDA (mainly visual), I decided to further 'clean' and 'prepare' the data before presenting any visualisations for FR10.  I produced a lot of visualisations using both `matplotlib` and `seaborn`, experimenting with the many options, occasionally getting lost in the process yet always learning something new.\n",
+    "\n",
+    "I believe that my code is very thorough and shows helpful, relevant output.  I made it as readable as possible, commenting where necessary, including adding markdown cells to present the story of the data.  \n",
+    "\n",
+    "I may be presenting too many similar visualisations, e.g. distributions.  This is partly because this is coursework (and the audience is not typical) but also because I am not yet sure which visualisation is preferable.  \n",
+    "\n",
+    "FR11-13 are more succinct because I had settled into an approach to style and convention (comments, naming, etc.), but they are still very thorough: I investigated different statistical tests and visualisations, and practised git, markdown, etc.\n",
+    "\n",
+    "### Strengths\n",
+    "\n",
+    "* use of different libraries\n",
+    "* attempt to find a style / approach / format\n",
+    "* thoroughness\n",
+    "* achieving goals\n",
+    "\n",
+    "### Weaknesses\n",
+    "\n",
+    "* too many visualisations\n",
+    "* duplication - i.e. multiple similar visualisations or statistical tests (although in my defence, I want to learn and practise...)\n",
+    "\n",
+    "### Future Improvements\n",
+    "\n",
+    "* learning standards, best practice, efficiencies, more libraries\n",
+    "* balance code / comment / markdown / output\n",
+    "* gain confidence and skills\n"
    ]
   },
   {
@@ -1806,7 +1855,7 @@
    "name": "python",
    "nbconvert_exporter": "python",
    "pygments_lexer": "ipython3",
-   "version": "3.11.0"
+   "version": "3.11.0 (main, Oct 24 2022, 18:26:48) [MSC v.1933 64 bit (AMD64)]"
   },
   "vscode": {
    "interpreter": {
-- 
GitLab