diff --git a/2022-12-18-23-11-02.png b/2022-12-18-23-11-02.png
new file mode 100644
index 0000000000000000000000000000000000000000..c0a39e1203baf4966d6c44f95f1e7795cbbc44e5
Binary files /dev/null and b/2022-12-18-23-11-02.png differ
diff --git a/Task1 reflect notes-DELETE.md b/Task1 reflect notes-DELETE.md
deleted file mode 100644
index 8947e190013ebec5f101e3899c88db5938c56ce4..0000000000000000000000000000000000000000
--- a/Task1 reflect notes-DELETE.md	
+++ /dev/null
@@ -1,11 +0,0 @@
-# file to store 'development process report' notes for task 1
-
-* pleasure of the process; also frustrating
-* process - crisp-dm - iterative
-* related to work - > importance of requirements and flexing, agility
-* pseudo code - getting into habit has been very useful (have a tendency, which I need to fight, to dive into the deep end and just manage to keep afloat)
-* getting into git (hub, lab) - but have been frustrated (authentication with gitlab - needing to switch to https from ssh, despite seemingly set up correctly)
-* single powerful tool, work environment - vs code - revolutionary
-* documentation - robust, standardised, conventions - i.e. commenting
-* intersections of domain, maths, coding
-* not enough time
diff --git a/UFCFVQ-15-M_Programming_Task_1.ipynb b/UFCFVQ-15-M_Programming_Task_1.ipynb
deleted file mode 100644
index a3e500ab36a8610a5541368e8eb7537e5ffc4e59..0000000000000000000000000000000000000000
--- a/UFCFVQ-15-M_Programming_Task_1.ipynb
+++ /dev/null
@@ -1,543 +0,0 @@
-{
- "cells": [
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "deletable": false
-   },
-   "source": [
-    "# UFCFVQ-15-M Programming for Data Science (Autumn 2022)\n"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "deletable": false
-   },
-   "source": [
-    "<p style=\"color:red; font-weight:bold; font-size:xx-small\">OVERALL COURSEWORK MARK: ___%</p>"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "deletable": false
-   },
-   "source": [
-    "### GitLab link submission, README.md file and Git commit messages\n",
-    "<p style=\"color:red; font-weight:bold; font-size:xx-small\">MARK: __%</p>\n",
-    "<p style=\"color:red; font-weight:bold; font-size:xx-small\">FEEDBACK: </p>"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "deletable": false
-   },
-   "source": [
-    "# Programming Task 1\n",
-    "\n",
-    "## Student Id: 05976423"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "deletable": false
-   },
-   "source": [
-    "### Requirement FR1 - Develop a function to find the arithmetic mean"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "\n",
-    "\n",
-    "$$\n",
-    "{\\displaystyle A={\\frac {1}{n}}\\sum _{i=1}^{n}a_{i}={\\frac {a_{1}+a_{2}+\\cdots +a_{n}}{n}}}\n",
-    "$$\n",
-    "\n",
-    "##### <b>`Pseudocode / Plan:`</b> \n",
-    "\n",
-    "<b>`Input:`</b> list of values\n",
-    "\n",
-    "<b>`Steps:` </b>\n",
-    "\n",
-    "1. sum list values\n",
-    "2. divide sum of list values by count (len) of list of values\n",
-    "3. return result\n",
-    "4. add try except block\n",
-    "\n",
-    "<b>`Output:`</b>\n",
-    "\n",
-    "* arithmetic mean"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 12,
-   "metadata": {
-    "deletable": false
-   },
-   "outputs": [
-    {
-     "data": {
-      "text/plain": [
-       "47.62"
-      ]
-     },
-     "execution_count": 12,
-     "metadata": {},
-     "output_type": "execute_result"
-    }
-   ],
-   "source": [
-    "# list of numbers\n",
-    "list = [85, 29, 35, 55, 82, 45, 42, 21, 42, 60, 56, 30, 72, 56, 37, 65, 29, 14, 66,  43, 23, 39, 81, 56, 74, 29, 22, 27, 14, 66, 55, 33, 31, 66, 63, 41, 30, 48, 68, 58, 51, 44, 66, 34, 20, 71, 59, 57, 43, 48]\n",
-    "\n",
-    "\n",
-    "\n",
-    "def mean(list):\n",
-    "    '''\n",
-    "    Function to calculate arithmetic mean - i.e. sum of data divided by number of data points.\n",
-    "    '''\n",
-    "    try:\n",
-    "        #print(sum(list) / len(list))\n",
-    "        return sum(list) / len(list)\n",
-    "    except ZeroDivisionError:\n",
-    "        print(\"Error: Division by zero. List is empty\")\n",
-    "    except TypeError:\n",
-    "        print(\"Error: Invalid typ in list. List must contain only numbers\")\n",
-    "    except:\n",
-    "        print(\"Error with list of numbers. Please check list\")\n",
-    "mean(list)\n"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "deletable": false
-   },
-   "source": [
-    "<p style=\"color:red; font-weight:bold; font-size:xx-small\">MARK: __%</p>\n",
-    "<p style=\"color:red; font-weight:bold; font-size:xx-small\">FEEDBACK: </p>"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "deletable": false
-   },
-   "source": [
-    "### Requirement FR2 - Develop a function to read a single column from a CSV file"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "\n",
-    "\n",
-    "##### <b>`Pseudocode / Plan:`</b> \n",
-    "\n",
-    "<b>`Input:`</b> filename/path\n",
-    "\n",
-    "<b>`Parameters:` </b>\n",
-    "\n",
-    "* file - name, path\n",
-    "* column to read -  column number (0 - n-1, where n is number of columns)\n",
-    "* delimiter - assume comma (,), but allow for other separator, e.g. /t (tab), | (pipe), etc.\n",
-    "* header - True / False, default = True\n",
-    "\n",
-    "<b>`Steps:` </b>\n",
-    "\n",
-    "1. Open file (with open, so autocloses file)\n",
-    "2. if / else for header status\n",
-    "3. extract column name as value\n",
-    "4. iterate over remaining lines with list comprehension to return list of values in column\n",
-    "5. return column name and values or just values\n",
-    "6. add try except block\n",
-    "\n",
-    "<b>`Output:`</b>\n",
-    "\n",
-    "* If header = True, return column name as value and list of data, else return only list of values\n"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 2,
-   "metadata": {
-    "deletable": false
-   },
-   "outputs": [],
-   "source": [
-    "\n",
-    "def data_from_column(filename, columnNum, delimiter = ',', header = True):\n",
-    "    '''\n",
-    "    Function to read csv file and return a list of data from a specified column (by column number 0 to n-1).  Default delimiter is comma.  If header is True, the function will return the column name and the data in a list.  If header is False, the function will return the data in a list.\n",
-    "    '''\n",
-    "    try:\n",
-    "        with open(filename) as openFile:\n",
-    "            if header == True:\n",
-    "                colName = next(openFile).split(delimiter)[columnNum]  \n",
-    "                data = [line.split(delimiter)[columnNum] for line in openFile]              \n",
-    "                return  colName, data\n",
-    "            else:\n",
-    "                return [line.split(delimiter)[columnNum] for line in openFile]\n",
-    "    except FileNotFoundError:\n",
-    "        print(\"Error: File not found. Please check file name, extension and path\")\n",
-    "    except:\n",
-    "        print(\"Error with file. Please check file\")\n",
-    "\n"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "# Return column names and data from csv file with header row\n",
-    "column, data = data_from_column('task1.csv', 1, header = True)\n",
-    "print(column)\n",
-    "print(data)"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "# Demo where returns only data (assuming file has no header row) - note task1.csv has a header row so 'pop' included in list\n",
-    "data = data_from_column('task1.csv', 1, ',', header = False)\n",
-    "print(data)\n"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "deletable": false
-   },
-   "source": [
-    "<p style=\"color:red; font-weight:bold; font-size:xx-small\">MARK: __%</p>\n",
-    "<p style=\"color:red; font-weight:bold; font-size:xx-small\">FEEDBACK: </p>"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "deletable": false
-   },
-   "source": [
-    "### Requirement FR3 - Develop a function to read CSV data from a file into memory"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "\n",
-    "##### <b>`Pseudocode:`</b> \n",
-    "\n",
-    "<b>`Input:`</b> filename/path\n",
-    "\n",
-    "<b>`Parameters:` </b>\n",
-    "\n",
-    "* file\n",
-    "* column to read, by column number (0 - n-1, where n is number of columns)\n",
-    "* delimiter - assume comma (,), but allow for other separator, e.g. /t (tab), | (pipe), etc.\n",
-    "* header - True / False, default = True\n",
-    "  \n",
-    "\n",
-    "<b>`Output:`</b>\n",
-    "\n",
-    "* If header = True, return column name as value and list of data, else return only list of values\n"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 39,
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "def read_csv_into_dictionary(filename, delimiter = ','):\n",
-    "    '''\n",
-    "    Function to read csv file and return a dictionary with column names as keys and data as values. Default delimiter is comma.  Assumes that the first line of the file contains the column names (i.e. header).\n",
-    "    '''\n",
-    "    try:\n",
-    "        with open(filename) as openFile:\n",
-    "            colNames = next(openFile).split(delimiter)\n",
-    "            data = [line.split(delimiter) for line in openFile]\n",
-    "            return {colNames[i]: [row[i] for row in data] for i in range(len(colNames))}\n",
-    "\n",
-    "    except FileNotFoundError:\n",
-    "        print(\"Error: File not found. Please check file name, extension and path\")\n"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "read_csv_into_dictionary('task1.csv')"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "deletable": false
-   },
-   "source": [
-    "<p style=\"color:red; font-weight:bold; font-size:xx-small\">MARK: __%</p>\n",
-    "<p style=\"color:red; font-weight:bold; font-size:xx-small\">FEEDBACK: </p>"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "deletable": false
-   },
-   "source": [
-    "### Requirement FR4 - Develop a function to calculate the Pearson Correlation Coefficient for two named columns"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "##### `Pearson's R formulae`\n",
-    "\n",
-    "Given paired data, consisting of $n$ pairs: \n",
-    "$$\n",
-    "\n",
-    "\\displaystyle\n",
-    "r =\n",
-    "  \\frac{ \\sum_{i=1}^{n}(x_i-\\bar{x})(y_i-\\bar{y}) }{%\n",
-    "        \\sqrt{\\sum_{i=1}^{n}(x_i-\\bar{x})^2}\\sqrt{\\sum_{i=1}^{n}(y_i-\\bar{y})^2}}\n",
-    "$$\n",
-    "\n",
-    "\n",
-    "\n",
-    "Pearson's correlation coefficient measures linear association between two variables\n",
-    "* range of -1 to 1\n",
-    "* -1 = perfect negative linear correlation\n",
-    "* 1 = perfect positive linear correlation\n",
-    "\n",
-    "source: [Wikipedia](https://en.wikipedia.org/wiki/Pearson_correlation_coefficient#Definition)\n",
-    "\n",
-    "\n",
-    "##### <b>`Pseudocode:`</b> \n",
-    "\n",
-    "<b>`Input:`</b> Two named columns\n",
-    "\n",
-    "Question: \n",
-    "- is this reading a csv and specifying columns by name as function parameters? \n",
-    "- or somehow return available columns and allowing user to input which ones to use?\n",
-    "- or by column number - maybe combination of both?? \n",
-    "  - tell user columns with col number and ask to input two numbers??\n",
-    "- maybe I could use above function x2 - choose col 1, then choose col 2\n",
-    "\n",
-    "\n",
-    "<b>`Parameters:` </b>\n",
-    "\n",
-    "* file??\n",
-    "* columns x2 - which need to be broken into colname & values\n",
-    "\n",
-    " \n",
-    "\n",
-    "<b>`Output:`</b>\n",
-    "\n",
-    "* variable 1, variable 2, r (Pearson's correlation coefficient)\n",
-    "\n",
-    "<b>`Steps: `</b>\n",
-    "\n",
-    "1. get data - use FR2, FR3 - to get list of values (x, y)\n",
-    "2. calculations needed for Pearson's r:\n",
-    "   1. mean of x values and mean of y values -> use FR1 mean(list)\n",
-    "   2. for each x, x-mean(x)\n",
-    "   3. for each y, y-mean(y)\n",
-    "   4. for each row (x-mean(x))*(y-mean(y))\n",
-    "   5. for each x, (x-mean(x))**2\n",
-    "   6. for each y, (y-mean(y))**2\n",
-    "3. sums of calculations\n",
-    "   1. sum of 2.4 -> sum of [for each row (x-mean(x))*(y-mean(y))]\n",
-    "   2. sum of 2.5 -> sum of [for each x, (x-mean(x))**2]\n",
-    "   3. sum of 2.6 -> sum of [for each y, (y-mean(y))**2]\n",
-    "4. calculate r\n",
-    "   1. r = sum of [for each row (x-mean(x))*(y-mean(y))] / sqrt((sum of [for each x, (x-mean(x))**2]) * (sum of [for each y, (y-mean(y))**2]))\n",
-    "\n",
-    "\n"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": []
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "metadata": {
-    "deletable": false
-   },
-   "outputs": [],
-   "source": [
-    "# replace with your code"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "deletable": false
-   },
-   "source": [
-    "<p style=\"color:red; font-weight:bold; font-size:xx-small\">MARK: __%</p>\n",
-    "<p style=\"color:red; font-weight:bold; font-size:xx-small\">FEEDBACK: </p>"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "deletable": false
-   },
-   "source": [
-    "### Requirement FR5 - Develop a function to generate a set of Pearson Correlation Coefficients for a given data file "
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "metadata": {
-    "deletable": false
-   },
-   "outputs": [],
-   "source": [
-    "# replace with your code"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "deletable": false
-   },
-   "source": [
-    "<p style=\"color:red; font-weight:bold; font-size:xx-small\">MARK: __%</p>\n",
-    "<p style=\"color:red; font-weight:bold; font-size:xx-small\">FEEDBACK: </p>"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "### <b> delete </b>\n",
-    "\n",
-    "* output (see appendix 4)",
-    "  * input: \n",
-    "  * print list of columns (variables) available in csv/data??\n",
-    "  * user input by name?? or colnumber??  confirmation??\n",
-    "\n",
-    "* output: \n",
-    "  * return results as input, with nicely bordered tables - old-school dot-matrix print outs (like r)\n",
-    "\n",
-    "\n"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "deletable": false
-   },
-   "source": [
-    "### Requirement FR6 - Develop a function to print a custom table"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "metadata": {
-    "deletable": false
-   },
-   "outputs": [],
-   "source": [
-    "# replace with your code"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "deletable": false
-   },
-   "source": [
-    "<p style=\"color:red; font-weight:bold; font-size:xx-small\">MARK: __%</p>\n",
-    "<p style=\"color:red; font-weight:bold; font-size:xx-small\">FEEDBACK: </p>"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "deletable": false
-   },
-   "source": [
-    "# Coding Standards\n",
-    "<p style=\"color:red; font-weight:bold; font-size:xx-small\">MARK: __%</p>\n",
-    "<p style=\"color:red; font-weight:bold; font-size:xx-small\">FEEDBACK: </p>"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "deletable": false
-   },
-   "source": [
-    "# Process Development Report for Task 1\n"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "add markdown text here"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "deletable": false
-   },
-   "source": [
-    "<p style=\"color:red; font-weight:bold; font-size:xx-small\">MARK: __%</p>\n",
-    "<p style=\"color:red; font-weight:bold; font-size:xx-small\">FEEDBACK: </p>"
-   ]
-  }
- ],
- "metadata": {
-  "kernelspec": {
-   "display_name": "Python 3.11.0 64-bit",
-   "language": "python",
-   "name": "python3"
-  },
-  "language_info": {
-   "codemirror_mode": {
-    "name": "ipython",
-    "version": 3
-   },
-   "file_extension": ".py",
-   "mimetype": "text/x-python",
-   "name": "python",
-   "nbconvert_exporter": "python",
-   "pygments_lexer": "ipython3",
-   "version": "3.11.0"
-  },
-  "vscode": {
-   "interpreter": {
-    "hash": "3a85823825384e2f260493b9b35c69d8eaac198ff59bb0d6c0e72fffbde301e2"
-   }
-  }
- },
- "nbformat": 4,
- "nbformat_minor": 4
-}
diff --git a/UFCFVQ-15-M_Programming_Task_1_submit.ipynb b/UFCFVQ-15-M_Programming_Task_1_submit.ipynb
new file mode 100644
index 0000000000000000000000000000000000000000..084d3834962d26a427d1a9cc6b4dc87a41bf353c
--- /dev/null
+++ b/UFCFVQ-15-M_Programming_Task_1_submit.ipynb
@@ -0,0 +1,626 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "deletable": false
+   },
+   "source": [
+    "# UFCFVQ-15-M Programming for Data Science (Autumn 2022)\n"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "deletable": false
+   },
+   "source": [
+    "<p style=\"color:red; font-weight:bold; font-size:xx-small\">OVERALL COURSEWORK MARK: ___%</p>"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "deletable": false
+   },
+   "source": [
+    "### GitLab link submission, README.md file and Git commit messages\n",
+    "<p style=\"color:red; font-weight:bold; font-size:xx-small\">MARK: __%</p>\n",
+    "<p style=\"color:red; font-weight:bold; font-size:xx-small\">FEEDBACK: </p>"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "deletable": false
+   },
+   "source": [
+    "# Programming Task 1\n",
+    "\n",
+    "## Student Id: 05976423"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "deletable": false
+   },
+   "source": [
+    "### Requirement FR1 - Develop a function to find the arithmetic mean"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 1,
+   "metadata": {
+    "deletable": false
+   },
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "47.62"
+      ]
+     },
+     "execution_count": 1,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "# list of numbers\n",
+    "values = [85, 29, 35, 55, 82, 45, 42, 21, 42, 60, 56, 30, 72, 56, 37, 65, 29, 14, 66, 43, 23, 39, 81, 56, 74, 29, 22, 27, 14, 66, 55, 33, 31, 66, 63, 41, 30, 48, 68, 58, 51, 44, 66, 34, 20, 71, 59, 57, 43, 48]\n",
+    "\n",
+    "def FR1_mean(values):\n",
+    "    '''\n",
+    "    Function to calculate arithmetic mean - i.e. sum of data divided by number of data points.\n",
+    "    Prints an error message and returns None if the mean cannot be calculated.\n",
+    "    '''\n",
+    "    try:\n",
+    "        return sum(values) / len(values) # sum of values divided by number of elements (parameter renamed so it no longer shadows the built-in list)\n",
+    "    except ZeroDivisionError:\n",
+    "        print(\"Error: Division by zero. List is empty\")\n",
+    "    except TypeError:\n",
+    "        print(\"Error: Invalid type in list. List must contain only numbers\")\n",
+    "    except Exception:\n",
+    "        print(\"Error with list of numbers. Please check list\")\n",
+    "FR1_mean(values)\n"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "deletable": false
+   },
+   "source": [
+    "<p style=\"color:red; font-weight:bold; font-size:xx-small\">MARK: __%</p>\n",
+    "<p style=\"color:red; font-weight:bold; font-size:xx-small\">FEEDBACK: </p>"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "deletable": false
+   },
+   "source": [
+    "### Requirement FR2 - Develop a function to read a single column from a CSV file"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 5,
+   "metadata": {
+    "deletable": false
+   },
+   "outputs": [],
+   "source": [
+    "\n",
+    "def FR2_data_from_column(filename, columnNum, delimiter = ',', header = True):\n",
+    "    '''\n",
+    "    Function with two mandatory parameters (filename and column number (0 to n-1)) and two optional parameters (delimiter and header).  The function will return a list of data from a specified column and the column name (if header is True).  If header is False, the function will return only the list of data.\n",
+    "    '''\n",
+    "    try:\n",
+    "        with open(filename) as openFile:\n",
+    "            if header:\n",
+    "                variable = next(openFile).rstrip('\\n').split(delimiter)[columnNum] # column name from header row\n",
+    "                data = [line.rstrip('\\n').split(delimiter)[columnNum] for line in openFile] # rstrip removes the line ending, which would otherwise stay attached to the last column\n",
+    "                return variable, data\n",
+    "            else:\n",
+    "                return [line.rstrip('\\n').split(delimiter)[columnNum] for line in openFile]\n",
+    "    except FileNotFoundError:\n",
+    "        print(\"Error: File not found. Please check file name, extension and path\")\n",
+    "    except Exception:\n",
+    "        print(\"Error with file. Please check file\")\n",
+    "\n"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 4,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "age\n",
+      "['16', '27', '26', '25', '29', '29', '22', '35', '44', '31']\n"
+     ]
+    }
+   ],
+   "source": [
+    "# Test function specifying task1.csv and first column. Header is True by default, so data and variable (column name) are returned.\n",
+    "variable, data = FR2_data_from_column('task1.csv', 0)\n",
+    "print(variable)\n",
+    "print(data[0:10]) # print first 10 elements of data list"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "deletable": false
+   },
+   "source": [
+    "<p style=\"color:red; font-weight:bold; font-size:xx-small\">MARK: __%</p>\n",
+    "<p style=\"color:red; font-weight:bold; font-size:xx-small\">FEEDBACK: </p>"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "deletable": false
+   },
+   "source": [
+    "### Requirement FR3 - Develop a function to read CSV data from a file into memory"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 10,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "def FR3_read_csv_into_dictionary(filename, delimiter = ','):\n",
+    "    '''\n",
+    "    Function with one mandatory parameter (filename) to read csv file and return a dictionary with column names as keys and data as values. Default delimiter is comma.  Assumes that the first line of the file contains the column names (i.e. header) which become the dictionary keys.\n",
+    "    '''\n",
+    "    try:\n",
+    "        with open(filename) as openFile: # open file\n",
+    "            variable = next(openFile).split(delimiter) # read first line and split into list - to get dictionary keys (note: the last key keeps the header line's trailing newline)\n",
+    "            data = [line.split(delimiter) for line in openFile] # read remaining lines and split into list of lists - to get corresponding dictionary values\n",
+    "            variable_data_dict = {variable[i]: [float(row[i]) for row in data] for i in range(len(variable))} # create dictionary with keys and values (as float) by iterating through variable list and data list\n",
+    "            return variable_data_dict\n",
+    "\n",
+    "    except FileNotFoundError:\n",
+    "        print(\"Error: File not found. Please check file name, extension and path\")\n"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 11,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "[16.0, 27.0, 26.0, 25.0, 29.0, 29.0, 22.0, 35.0, 44.0, 31.0]\n"
+     ]
+    }
+   ],
+   "source": [
+    "my_dict = FR3_read_csv_into_dictionary('task1.csv')\n",
+    "print(my_dict['age'][0:10]) # print first 10 elements of 'age' column"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "deletable": false
+   },
+   "source": [
+    "<p style=\"color:red; font-weight:bold; font-size:xx-small\">MARK: __%</p>\n",
+    "<p style=\"color:red; font-weight:bold; font-size:xx-small\">FEEDBACK: </p>"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "deletable": false
+   },
+   "source": [
+    "### Requirement FR4 - Develop a function to calculate the Pearson Correlation Coefficient for two named columns"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 12,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "def FR4_pearsonCorrCoef(x, y):\n",
+    "    '''\n",
+    "    Function to calculate the Pearson Correlation Coefficient (PCC), often represented by the letter 'r'.  PCC measures the linear correlation between two variables and ranges from -1 to 1.  A value of 1 indicates a perfect positive linear relationship; a value of -1 indicates a perfect negative linear relationship; and a value of 0 indicates no linear relationship.\n",
+    "\n",
+    "    The function takes two lists of numbers as input and returns a single value - the Pearson Correlation Coefficient.  The function will return None if the lists are not the same length or if the lists contain non-numerical values.\n",
+    "    \n",
+    "    '''\n",
+    "    # Check that x and y are lists of numbers of same length\n",
+    "    # (explicit checks are used because assert statements are skipped when Python runs with -O)\n",
+    "    valid = (isinstance(x, list) and isinstance(y, list)\n",
+    "             and len(x) == len(y)\n",
+    "             and len(x) > 0\n",
+    "             and all(isinstance(value, (int, float)) for value in x + y))\n",
+    "    if not valid:\n",
+    "        print(\"Error: x and y MUST be same-length lists of only numbers in order to calculate Pearson's Correlation Coefficient\")\n",
+    "        return None\n",
+    "\n",
+    "    # Calculate mean of x and y\n",
+    "    avg_x = FR1_mean(x)\n",
+    "    avg_y = FR1_mean(y)\n",
+    "\n",
+    "    # Calculate (population) standard deviation of x and y\n",
+    "    stdx = (sum([(xi - avg_x)**2 for xi in x]) / len(x)) ** 0.5\n",
+    "    stdy = (sum([(yi - avg_y)**2 for yi in y]) / len(y)) ** 0.5\n",
+    "\n",
+    "    \n",
+    "    # returns list of tuples with x, y and PCC values if required\n",
+    "    #PCCs = [(x[i] - avg_x) * (y[i] - avg_y) / (stdx * stdy) for i in range(len(x))]\n",
+    "    #return [(x[i],y[i],PCCs[i]) for i in range(len(x))]\n",
+    "\n",
+    "    # Calculate Pearson Correlation Coefficient for lists x and y\n",
+    "    r = FR1_mean([(x[i] - avg_x) * (y[i] - avg_y) for i in range(len(x))]) / (stdx * stdy)\n",
+    "\n",
+    "    return r\n",
+    "\n",
+    "\n"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 13,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "0.8984458631125747\n",
+      "[[1.         0.89844586]\n",
+      " [0.89844586 1.        ]]\n"
+     ]
+    }
+   ],
+   "source": [
+    "#testing FR4_pearsonCorrCoef function against numpy corrcoef\n",
+    "x = [1, 2, 3, 5]\n",
+    "y = [1, 5, 7, 8]\n",
+    "print(FR4_pearsonCorrCoef(x, y))\n",
+    "\n",
+    "import numpy as np\n",
+    "print(np.corrcoef(x, y))\n"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "deletable": false
+   },
+   "source": [
+    "<p style=\"color:red; font-weight:bold; font-size:xx-small\">MARK: __%</p>\n",
+    "<p style=\"color:red; font-weight:bold; font-size:xx-small\">FEEDBACK: </p>"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "deletable": false
+   },
+   "source": [
+    "### Requirement FR5 - Develop a function to generate a set of Pearson Correlation Coefficients for a given data file "
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 16,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "def FR5_PCCs_from_csv(filename):\n",
+    "    '''\n",
+    "    Function to calculate Pearson Correlation Coefficient (PCC) for all combinations of columns in a csv file, where each column is a variable, with column header as variable name.\n",
+    "    '''\n",
+    "\n",
+    "    # Read csv file into variable as dictionary\n",
+    "    my_dict = FR3_read_csv_into_dictionary(filename)\n",
+    "\n",
+    "    # Iterate through dictionary to calculate PCC for all combinations of variables, using FR4_pearsonCorrCoef function\n",
+    "    PCC_list_of_tuples = [(variable, variable2, round(FR4_pearsonCorrCoef(my_dict[variable], my_dict[variable2]), 5))\n",
+    "           for variable in my_dict\n",
+    "           for variable2 in my_dict]\n",
+    "    #print(PCC_list_of_tuples)\n",
+    "    return PCC_list_of_tuples\n",
+    "    "
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 18,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "[('age', 'age', 1.0), ('age', 'pop', -0.02671), ('age', 'share_white', 0.19961), ('age', 'share_black', -0.08807), ('age', 'share_hispanic', -0.13679), ('age', 'personal_income', 0.03248), ('age', 'household_income', 0.07123), ('age', 'poverty_rate', -0.11502), ('age', 'unemployment_rate', -0.08924), ('age', 'uni_education_25+\\n', -0.01555), ('pop', 'age', -0.02671), ('pop', 'pop', 1.0), ('pop', 'share_white', 0.07551), ('pop', 'share_black', -0.1562), ('pop', 'share_hispanic', 0.06195), ('pop', 'personal_income', 0.20486), ('pop', 'household_income', 0.30517), ('pop', 'poverty_rate', -0.29133), ('pop', 'unemployment_rate', -0.21784), ('pop', 'uni_education_25+\\n', 0.11698), ('share_white', 'age', 0.19961), ('share_white', 'pop', 0.07551), ('share_white', 'share_white', 1.0), ('share_white', 'share_black', -0.54497), ('share_white', 'share_hispanic', -0.57744), ('share_white', 'personal_income', 0.35839), ('share_white', 'household_income', 0.32212), ('share_white', 'poverty_rate', -0.49771), ('share_white', 'unemployment_rate', -0.38967), ('share_white', 'uni_education_25+\\n', 0.33416), ('share_black', 'age', -0.08807), ('share_black', 'pop', -0.1562), ('share_black', 'share_white', -0.54497), ('share_black', 'share_black', 1.0), ('share_black', 'share_hispanic', -0.26242), ('share_black', 'personal_income', -0.28248), ('share_black', 'household_income', -0.34674), ('share_black', 'poverty_rate', 0.43067), ('share_black', 'unemployment_rate', 0.48363), ('share_black', 'uni_education_25+\\n', -0.21296), ('share_hispanic', 'age', -0.13679), ('share_hispanic', 'pop', 0.06195), ('share_hispanic', 'share_white', -0.57744), ('share_hispanic', 'share_black', -0.26242), ('share_hispanic', 'share_hispanic', 1.0), ('share_hispanic', 'personal_income', -0.22313), ('share_hispanic', 'household_income', -0.13596), ('share_hispanic', 'poverty_rate', 0.20829), ('share_hispanic', 'unemployment_rate', 0.01475), ('share_hispanic', 'uni_education_25+\\n', -0.29098), ('personal_income', 'age', 0.03248), ('personal_income', 'pop', 0.20486), ('personal_income', 'share_white', 0.35839), ('personal_income', 'share_black', -0.28248), ('personal_income', 'share_hispanic', -0.22313), ('personal_income', 'personal_income', 1.0), ('personal_income', 'household_income', 0.83196), ('personal_income', 'poverty_rate', -0.69592), ('personal_income', 'unemployment_rate', -0.50493), ('personal_income', 'uni_education_25+\\n', 0.71661), ('household_income', 'age', 0.07123), ('household_income', 'pop', 0.30517), ('household_income', 'share_white', 0.32212), ('household_income', 'share_black', -0.34674), ('household_income', 'share_hispanic', -0.13596), ('household_income', 'personal_income', 0.83196), ('household_income', 'household_income', 1.0), ('household_income', 'poverty_rate', -0.75418), ('household_income', 'unemployment_rate', -0.51), ('household_income', 'uni_education_25+\\n', 0.6729), ('poverty_rate', 'age', -0.11502), ('poverty_rate', 'pop', -0.29133), ('poverty_rate', 'share_white', -0.49771), ('poverty_rate', 'share_black', 0.43067), ('poverty_rate', 'share_hispanic', 0.20829), ('poverty_rate', 'personal_income', -0.69592), ('poverty_rate', 'household_income', -0.75418), ('poverty_rate', 'poverty_rate', 1.0), ('poverty_rate', 'unemployment_rate', 0.59169), ('poverty_rate', 'uni_education_25+\\n', -0.46034), ('unemployment_rate', 'age', -0.08924), ('unemployment_rate', 'pop', -0.21784), ('unemployment_rate', 'share_white', -0.38967), ('unemployment_rate', 'share_black', 0.48363), ('unemployment_rate', 'share_hispanic', 0.01475), ('unemployment_rate', 'personal_income', -0.50493), ('unemployment_rate', 'household_income', -0.51), ('unemployment_rate', 'poverty_rate', 0.59169), ('unemployment_rate', 'unemployment_rate', 1.0), ('unemployment_rate', 'uni_education_25+\\n', -0.46639), ('uni_education_25+\\n', 'age', -0.01555), ('uni_education_25+\\n', 'pop', 0.11698), ('uni_education_25+\\n', 'share_white', 0.33416), ('uni_education_25+\\n', 'share_black', -0.21296), ('uni_education_25+\\n', 'share_hispanic', -0.29098), ('uni_education_25+\\n', 'personal_income', 0.71661), ('uni_education_25+\\n', 'household_income', 0.6729), ('uni_education_25+\\n', 'poverty_rate', -0.46034), ('uni_education_25+\\n', 'unemployment_rate', -0.46639), ('uni_education_25+\\n', 'uni_education_25+\\n', 1.0)]\n"
+     ]
+    }
+   ],
+   "source": [
+    "FR5_output = FR5_PCCs_from_csv('task1.csv')\n",
+    "print(FR5_output)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "deletable": false
+   },
+   "source": [
+    "<p style=\"color:red; font-weight:bold; font-size:xx-small\">MARK: __%</p>\n",
+    "<p style=\"color:red; font-weight:bold; font-size:xx-small\">FEEDBACK: </p>"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "deletable": false
+   },
+   "source": [
+    "### Requirement FR6 - Develop a function to print a custom table"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 20,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# Function to calculate max column width, used in FR6\n",
+    "def max_col_width(tup_list):\n",
+    "    '''\n",
+    "    Function to calculate the maximum column width for a list of tuples'''\n",
+    "    max_cols = 0\n",
+    "    for row in tup_list:\n",
+    "        max_cols = max(max_cols, len(row))\n",
+    "\n",
+    "    col_widths = [0] * max_cols\n",
+    "    for row in tup_list:\n",
+    "        for col, value in enumerate(row):\n",
+    "            col_widths[col] = max(col_widths[col], len(str(value)))\n",
+    "    return max(col_widths)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 70,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "def FR6_print_table(tup_list, *col_headers, pad_char = '*'):\n",
+    "    '''\n",
+    "    Function which takes a list of tuples, columns to include (as *arguments) and optional single padding character (defaulted to '*') as parameters.  The padding character is used to create a table with a border.\n",
+    "    '''\n",
+    "\n",
+    "    # if no column headers are provided, use all unique column headers from tup_list\n",
+    "    if not col_headers:\n",
+    "        col_headers = sorted(set([x[0] for x in tup_list]))\n",
+    "    else:\n",
+    "        col_headers = col_headers\n",
+    "    # create list of unique row headers (same as cols)\n",
+    "    row_headers = col_headers\n",
+    "    \n",
+    "    # calculate maximum column width in the data\n",
+    "    max_width = int(max_col_width(tup_list) * 1.9)\n",
+    "    \n",
+    "    # create table string with top border based on padding character and maximum column width\n",
+    "    table_str =  ' ' * int(max_width/2) + pad_char * (max_width * (len(col_headers))) + pad_char * (len(col_headers)+1)+ '\\n' \n",
+    "    table_str += ' ' * int(max_width/2) \n",
+    "\n",
+    "    # add column headers to table string, using padding character and maximum column width\n",
+    "    for col in col_headers:\n",
+    "        table_str += f\"{col:^{(max_width)+1}}\" \n",
+    "    \n",
+    "    table_str += '\\n' \n",
+    "    table_str += ' ' * int(max_width/2) + pad_char * (max_width * (len(col_headers))) +  pad_char * (len(col_headers)+1)+'\\n' \n",
+    "\n",
+    "    # add row headers and values to table string, using padding character and maximum column width\n",
+    "    for row in row_headers:\n",
+    "        table_str += f\"{row:<{int(max_width/2)}}\"+pad_char \n",
+    "        \n",
+    "        # Get the corresponding value (3rd element of tuple) for the current row and column; if no value, use '-'\n",
+    "        for col in col_headers:            \n",
+    "            r_val = next((x[2] for x in tup_list if x[0] == col and x[1] == row), '-') \n",
+    "            \n",
+    "            # if value is missing ('-') or non-negative, add a space to the left of the value to keep the table aligned\n",
+    "            if isinstance(r_val, str) or r_val >= 0:\n",
+    "                table_str += f\" {r_val:^{max_width-1}}\" + pad_char \n",
+    "            else:\n",
+    "                table_str += f\"{r_val:^{max_width}}\" + pad_char\n",
+    "           \n",
+    "        table_str += '\\n' \n",
+    "\n",
+    "    # add bottom border to table string, using padding character and maximum column width    \n",
+    "    table_str += ' ' * int(max_width/2) + pad_char * max_width * len(col_headers) + pad_char * (len(col_headers)+1)+  '\\n\\n' \n",
+    "    \n",
+    "    # add caption for table\n",
+    "    table_str += ' ' * int(max_width/2) + \"Pearson's Correlation Coefficient for %s\" % (col_headers,)\n",
+    "\n",
+    "    # print table string\n",
+    "    print(table_str)\n"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 71,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "                 ----------------------------------------------------------------------------------------------------------\n",
+      "                                 age                           poverty_rate                     household_income          \n",
+      "                 ----------------------------------------------------------------------------------------------------------\n",
+      "age              -                1.0               -             -0.11502             -              0.07123             -\n",
+      "poverty_rate     -             -0.11502             -                1.0               -             -0.75418             -\n",
+      "household_income -              0.07123             -             -0.75418             -                1.0               -\n",
+      "                 ----------------------------------------------------------------------------------------------------------\n",
+      "\n",
+      "                 Pearson's Correlation Coefficient for ('age', 'poverty_rate', 'household_income')\n"
+     ]
+    }
+   ],
+   "source": [
+    "FR6_print_table(FR5_output, 'age', 'poverty_rate', 'household_income', pad_char = '-')\n"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "deletable": false
+   },
+   "source": [
+    "<p style=\"color:red; font-weight:bold; font-size:xx-small\">MARK: __%</p>\n",
+    "<p style=\"color:red; font-weight:bold; font-size:xx-small\">FEEDBACK: </p>"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "deletable": false
+   },
+   "source": [
+    "# Coding Standards\n",
+    "<p style=\"color:red; font-weight:bold; font-size:xx-small\">MARK: __%</p>\n",
+    "<p style=\"color:red; font-weight:bold; font-size:xx-small\">FEEDBACK: </p>"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "deletable": false
+   },
+   "source": [
+    "# Process Development Report for Task 1\n"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "### <b> Introduction </b>\n",
+    "\n",
+    "The purpose of this report is to provide a short, critical self-assessment of my code development process for Task 1 of the coursework for `UFCFVQ-15-M Programming_for_Data_Science`.  \n",
+    "\n",
+    "### <b> Code Description </b>\n",
+    "Task 1 requires writing a sequence of functions that ultimately calculate Pearson’s Correlation Coefficients (PCCs) for pairs of variables in a given data file, without using imported Python libraries, and print the results as a readable table.  \n",
+    "\n",
+    "Functional requirements (FRs):\n",
+    "\n",
+    "| FR  | Description           |\n",
+    "|-----|-----------------------|\n",
+    "| FR1 | Arithmetic mean       |   \n",
+    "| FR2 | Read column from file |   \n",
+    "| FR3 | Read file             |   \n",
+    "| FR4 | PCC for two lists     |   \n",
+    "| FR5 | PCC for file          |   \n",
+    "| FR6 | Print table           |   \n",
+    "\n",
+    "The code was developed in a Jupyter notebook using a Python 3.11 kernel.  \n",
+    "\n",
+    "\n",
+    "### <b> Development Process</b>\n",
+    "\n",
+    "My development process made use of the task’s inherent structure, allowing me to plan, develop and test each FR independently, before combining as needed.  This was especially useful for more complex FRs, which required significant iteration and testing before achieving the desired results.\n",
+    "\n",
+    "\n",
+    "I used a modified CRISP-DM approach: understanding the requirements, then cycling through iterations of pseudocode, Python code and testing until I achieved the desired results.   I found it very effective, but I can occasionally go “off-piste” in the iterations, which can be time-consuming, frustrating and ultimately less productive. \n",
+    "\n",
+    "![](2022-12-18-23-11-02.png)\n",
+    "    \n",
+    "I made conscious use of “new-to-me” tools and techniques such as Git, VS Code, Jupyter notebooks and Markdown.\n",
+    "\n",
+    "### <b> Code Evaluation </b>\n",
+    "Overall, I am pleased with my code - functions achieve the requirements (as interpreted) and they <i>feel</i> efficient and robust.  \n",
+    "\n",
+    "Principles in mind when writing functions:\n",
+    "\n",
+    "* Future-proofed:  generic, flexible, adaptable to allow reusability\n",
+    "* User-friendly, by adding assertions and error-handling\n",
+    "* Unambiguous, self-explanatory naming of functions and variables\n",
+    "* Helpful comments/docstrings by balancing approaches like DRY (Don’t Repeat Yourself), WET (Write Everything Twice), KISS (Keep it Simple, Stupid)\n",
+    "\n",
+    "#### <b> Strengths </b> \n",
+    "* Well-commented, functioning code\n",
+    "* Consistent Git use for version control\n",
+    "* Kept working notes\n",
+    "\n",
+    "#### <b> Improvements / To-do </b>\n",
+    "* Perhaps over-commented; erred on side of caution\n",
+    "* Establish preferred naming convention – camelCase, snake_case\n",
+    "* Learn Python conventions \n",
+    "* Don’t get side-tracked when testing\n",
+    "\t* Update pseudo code\n",
+    "\t\n",
+    "[Archived reflective notes by task](archived/Task1_FR_reflections.md)\n",
+    "\n",
+    "\n",
+    "#### <b> Summary </b>\n",
+    "I found this task both appealing and beneficial.  It allowed me to build a useful function from the ground up, making use of different Python coding techniques and data structures whilst also employing version control and applying appropriate metadata to the code.\n",
+    "\n",
+    "I am super-keen to keep learning for my personal and professional development, picking up best practice and standard approaches, and avoiding pitfalls.  This task allowed me to practise all of this.  \n",
+    "\n",
+    "When it comes to Python, I am amazed at the many possibilities of solving the same scenario – this can make it challenging to identify the ‘best approach,’ if it exists.  This is something I will need to get used to and embrace.\n",
+    "\n",
+    "\n"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "deletable": false
+   },
+   "source": [
+    "<p style=\"color:red; font-weight:bold; font-size:xx-small\">MARK: __%</p>\n",
+    "<p style=\"color:red; font-weight:bold; font-size:xx-small\">FEEDBACK: </p>"
+   ]
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "Python 3.11.0 64-bit",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.11.0"
+  },
+  "vscode": {
+   "interpreter": {
+    "hash": "3a85823825384e2f260493b9b35c69d8eaac198ff59bb0d6c0e72fffbde301e2"
+   }
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 4
+}
diff --git a/archived/Task1_FR_reflections.md b/archived/Task1_FR_reflections.md
new file mode 100644
index 0000000000000000000000000000000000000000..0dccd5e76c1ce277651dfc96bc8e74367a38477c
--- /dev/null
+++ b/archived/Task1_FR_reflections.md
@@ -0,0 +1,89 @@
+### FR1: arithmetic mean
+
+* easy to implement - good confidence boost
+* allows for getting into a flow of coding, including comments, docstrings, naming conventions (variables and functions), testing, etc.
+* I kept returning to the naming and commenting of early requirements - as they developed.  
+* Challenge was getting the balance right - on the one hand, I wanted to ensure that I was including adequate comments but on the other hand, I did not want to over document.  Bearing in mind that this is an assessed element of coursework submitted for a master's module, I think I have the right balance.  In 'the real world,' I would probably shorten some of the docstrings and remove some in line comments.  
+* My tactic has been to try to name variables and functions in an unambiguous, meaningful manner so that documentation is unnecessary.
+* I haven't settled on a style yet - e.g. casing (proper, camel, etc.) and use of underscores.  
+* When writing functions, I wanted them to have reuse value - that is, to keep them general.  With some requirements, this was not possible - some of my ideas would have deviated from the brief.  Examples later.  
+* I wanted to use helpful python code where relevant, e.g. try/except blocks with specific (anticipated) errors captured.  
+
+* What I would do differently:
+  * I need to be mindful of using built-in names as variable names.  I have used 'list' in the FR1 function, which shadows Python's built-in list type.  This is not a problem in this instance but it could be in others.  I will need to be more careful in future.
+  * I would also like to assess the efficiency of my functions - there are so many ways to write functions producing the same outcomes in Python.  I am absolutely amazed at how flexible and powerful it is - but at the moment, I am pleased when I can get my code to work.  Efficiency, elegance, etc. will come with time and experience - but I could start timing my functions to compare processing times for different approaches. 
+
+### FR2: read single column from CSV file
+
+* Again, I wanted to make a reusable function, so I added 'delimiter' as an optional parameter.  I prefer to use pipes (|) as delimiters as they are very seldom used in text.  This is something I have found handy in data migration work I have undertaken in my current job.  Quite often - common characters like commas and 'invisible' spacing characters like tabs, new line, carriage returns, etc. can wreak havoc when migrating data.  
+* I liked this function as it was a good opportunity to use the 'with' statement.  I like that it 'closes' the file when finished, without a need for an explicit close statement.  
+* The challenge for me in this requirement was working out how to extract the header row and then proceed to the data rows.  
+* It was a good example of where I knew what I wanted to do and it seemed simple enough, but took some trial and error before I got there (plenty more to come!)
+* Also added a specific FileNotFoundError exception as I expect this to be the most common error.
+* I also added an if/else statement to account for whether the file has a header - returning results accordingly.  While this is not in the brief, I was thinking about a future-proofed function which can handle both varieties.
+
+### FR3: read CSV data from a file into memory
+
+* Much of the above applies - the function docstring is probably a bit long but I wanted to be clear and complete in my documentation.
+* This function builds on FR2 - but returns a dictionary by iterating through the data rows of the file.  
+* I worked through many versions of this - for loops, enumerate, etc. - but settled on a list comprehension, once I got it working the way it should.  I am not sure if it is the most efficient way to do it, but it works.  I like the conciseness of list comprehensions - although they may not be as readable as a for loop.
+* In terms of variable names, I am not always clear on the standards - i, j, row, line, etc. and I vary between them - which annoys me.  Consistency, clarity, transparency, etc. are very important to me. I am not so happy with 'variable_data_dict' as a name, but couldn't find anything better - 'my_dict', 'data_dict', etc. - I am critical of them all. 
+
+
+* One problem with FR3 was only discovered when using the function in various versions of FR4.  The original dictionary stored its values as strings - which looked fine to me when printing.  It took me some time to realise that the values were strings - that they were enclosed in single quotation marks.  This was certainly a source of frustration - when something simply was not behaving the way I was anticipating.
+  * However, once I noticed, I tried to amend the list comprehension by converting to integers with int() - which caused an error - "invalid literal for int() with base 10: '60.5'"
+  * Therefore I converted the data to floats - which may not be quite right for ages, but will not impact the calculations.
+
+
+### FR4: Pearson Correlation Coefficient
+
+* I spent a lot of time on this function and tried to make it work using several methodologies.  
+* Pseudo code was particularly helpful in this instance, but I have a tendency to start with pseudo code and then deviate into a confused space of trial and error rather than updating the pseudo code.  Definitely something I can work on and take away into my current day job. 
+* Firstly I needed to remind myself how to calculate Pearson's r, having only ever had it calculated for me in SPSS or R.  There are a couple of approaches - but the strategy is to break down the formula into standalone calculations and build the function to produce these before putting it all together to return r.  
+* I spent a lot of time trying different approaches.  Again, like last time, I started with for loops and enumerate() but found that I couldn't always achieve my intentions - I would get slightly lost in the hierarchy and multi-line approach - so I used list comprehensions instead and, in the end, I am pleased with the result.
+
+* As previously, I tried thinking about generalising and future-proofing the function for future use and that includes considering what tests / checks / validation I could put in place.  The most obvious source of error will be the csv file so I added some assertions to ensure that the csv file has the correct data before proceeding to calculate Pearson's r.  
+* I tested my version of Pearson's r against NumPy's corrcoef() and it produced the same results for a small data set, which gave me the excitement of having achieved my intentions.
+
+
+### FR5: Pearson Correlation Coefficients for a file
+
+* This function builds on FR4 by calculating Pearson's r for each pair of columns in a csv file.
+* I enjoyed using FR3 and FR4 (which in turn use FR1) within this function.  That is pretty cool.
+* It does not have many lines of code but it took me a while to get it working.  I had to think about how to iterate through the columns of the csv in terms of what has been returned by FR3 and FR4 and then getting the column names into a list of tuples.
+* It was not plain sailing and there was a fair bit of frustration - another good example of where I knew exactly what I wanted to do but it was hard to get there.  I had a few versions on the go - I think I could have benefited by sticking with one and updating the pseudo code.  
+* As previously, I moved from for loops to list comprehensions as I had better success getting the results I wanted, especially returning lists of tuples.  Initially, I was returning individual lists instead of lists of tuples.  
+* The joy of success after some struggle can be very satisfying. 
+
+I would probably look to round some of the output in future iterations.
+Actually - I just added the rounding now!
+
+And finally
+
+### FR6: printing the results
+
+* I found this FR very challenging and a bit frustrating.
+* I could have benefitted significantly by sticking with my pseudo code and keeping it up to date - but once I got stuck in a cycle of trial and error, minor success and subsequent set-back, I did not revert to pseudocode and instead focused on cracking the issue which was hampering me.  
+* This function was created through trial and error - entirely
+* I broke it down into separate sections and tried to think how I would iterate through the list of tuples to get a decent end result.
+* Some sections worked, others didn't, changes then impacted the rest of the function and I was back to square one - hardforking!
+* Towards the end, I had functioning portions - the top table header, the column headers, the row headers, the r-values.  This was a mess of for loops, list comprehensions, and string formatting which I needed to work through step by step.
+* I found cells in Jupyter notebooks to be invaluable - the ability to run portions of code and easily move them around is very effective and powerful.  I also found it very useful to comment out sections of code and then uncomment them to see how they impacted the rest of the function.
+* Finally, the pieces (rows of code) were lining up - at this point, I resorted to tweaking and assessing the impact of each change.  I discovered that I needed data specific to the input table - rather than fixed widths - so created a separate function to calculate max column widths and then played with that to get the table to look passable.  
+
+Whew - got there in the end!  
+
+This was more challenging to get motivated for, as I am sure I would normally use a library to print tables without the effort and challenge - e.g. pandas, prettytable, etc.  However, in hindsight I am glad I persevered as I have learned a lot about string formatting and how to iterate through lists of tuples.  It allowed me to trouble-shoot, iterate through a requirement, reassess and rework through many cycles.  I am sure I will use this knowledge in the future.
+
+### general
+
+* pleasure of the process; also frustrating
+* process - crisp-dm - iterative
+* related to work - > importance of requirements and flexing, agility
+* pseudo code - getting into habit has been very useful (have a tendency, which I need to fight, to dive into the deep end and just manage to keep afloat)
+* getting into git (hub, lab) - but have been frustrated (authentication with gitlab - needing to switch to https from ssh, despite seemingly set up correctly)
+* single powerful tool, work environment - vs code - revolutionary
+* documentation - robust, standardised, conventions - i.e. commenting
+* intersections of domain, maths, coding
+* not enough time
+