diff --git a/Lab10/CI101-Lab10-Answer Sheet -70 points (1).pdf b/Lab10/CI101-Lab10-Answer Sheet -70 points (1).pdf
new file mode 100644
index 0000000000000000000000000000000000000000..5b49587b4b9fa3d778dee81e3fac0942c196da82
Binary files /dev/null and b/Lab10/CI101-Lab10-Answer Sheet -70 points (1).pdf differ
diff --git a/Lab10/CI101-Lab10-Answer Sheet -70 points (1).rtf b/Lab10/CI101-Lab10-Answer Sheet -70 points (1).rtf
deleted file mode 100644
index 4dad7d96d6c7306a668661029b30c464c505c222..0000000000000000000000000000000000000000
--- a/Lab10/CI101-Lab10-Answer Sheet -70 points (1).rtf
+++ /dev/null
@@ -1,134 +0,0 @@
-{\rtf1\ansi\ansicpg1252\deff0\nouicompat\deflang1033\deflangfe1033{\fonttbl{\f0\froman\fprq2\fcharset0 Cambria;}}
-{\*\generator Riched20 10.0.17134}{\*\mmathPr\mdispDef1\mwrapIndent1440 }\viewkind4\uc1
-\pard\widctlpar\f0\fs28 Big Data Lab Answer Sheet. \par
-Please complete this answer sheet and turn it in at the beginning of class on the due date posted in LEARN. \par
-Part I\par
-\fs24\par
-\trowd\trgaph108\trleft5\trbrdrl\brdrs\brdrw10 \trbrdrt\brdrs\brdrw10 \trbrdrr\brdrs\brdrw10 \trbrdrb\brdrs\brdrw10 \trpaddl108\trpaddr108\trpaddfl3\trpaddfr3
-\clbrdrl\brdrw10\brdrs\clbrdrt\brdrw10\brdrs\clbrdrr\brdrw10\brdrs\clbrdrb\brdrw10\brdrs \cellx1823\clbrdrl\brdrw10\brdrs\clbrdrt\brdrw10\brdrs\clbrdrr\brdrw10\brdrs\clbrdrb\brdrw10\brdrs \cellx10013
-\pard\intbl\widctlpar\qc\b\fs36 Part 1:\cell
-\pard\intbl\widctlpar\b0 Answer\cell\row\trowd\trgaph108\trleft5\trbrdrl\brdrs\brdrw10 \trbrdrt\brdrs\brdrw10 \trbrdrr\brdrs\brdrw10 \trbrdrb\brdrs\brdrw10 \trpaddl108\trpaddr108\trpaddfl3\trpaddfr3
-\clbrdrl\brdrw10\brdrs\clbrdrt\brdrw10\brdrs\clbrdrr\brdrw10\brdrs\clbrdrb\brdrw10\brdrs \cellx1823\clbrdrl\brdrw10\brdrs\clbrdrt\brdrw10\brdrs\clbrdrr\brdrw10\brdrs\clbrdrb\brdrw10\brdrs \cellx10013
-\pard\intbl\widctlpar\qc\b\fs32 1\par
-\b0 (4 pts)\cell
-\pard\intbl\widctlpar JSON lays out information the way a dictionary labels its entries. It is separated into two columns: the left side is the key and the right side is the value.\cell\row\trowd\trgaph108\trleft5\trbrdrl\brdrs\brdrw10 \trbrdrt\brdrs\brdrw10 \trbrdrr\brdrs\brdrw10 \trbrdrb\brdrs\brdrw10 \trpaddl108\trpaddr108\trpaddfl3\trpaddfr3
-\clbrdrl\brdrw10\brdrs\clbrdrt\brdrw10\brdrs\clbrdrr\brdrw10\brdrs\clbrdrb\brdrw10\brdrs \cellx1823\clbrdrl\brdrw10\brdrs\clbrdrt\brdrw10\brdrs\clbrdrr\brdrw10\brdrs\clbrdrb\brdrw10\brdrs \cellx10013
-\pard\intbl\widctlpar\qc\b 2\par
-\b0 (4 pts)\cell
-\pard\intbl\widctlpar XML is formatted similarly to an HTML file. All content sits inside a tag, with subtags specifying the contents a bit more clearly.\cell\row\trowd\trgaph108\trleft5\trbrdrl\brdrs\brdrw10 \trbrdrt\brdrs\brdrw10 \trbrdrr\brdrs\brdrw10 \trbrdrb\brdrs\brdrw10 \trpaddl108\trpaddr108\trpaddfl3\trpaddfr3
-\clbrdrl\brdrw10\brdrs\clbrdrt\brdrw10\brdrs\clbrdrr\brdrw10\brdrs\clbrdrb\brdrw10\brdrs \cellx1823\clbrdrl\brdrw10\brdrs\clbrdrt\brdrw10\brdrs\clbrdrr\brdrw10\brdrs\clbrdrb\brdrw10\brdrs \cellx10013
-\pard\intbl\widctlpar\qc\b 3\par
-\b0 (4 pts)\cell
-\pard\intbl\widctlpar 2015 Alabama: 2552\par
-2015 Alaska: 288\par
-More people live in Alabama, so there are more chances for people to die by accident.\cell\row\trowd\trgaph108\trleft5\trbrdrl\brdrs\brdrw10 \trbrdrt\brdrs\brdrw10 \trbrdrr\brdrs\brdrw10 \trbrdrb\brdrs\brdrw10 \trpaddl108\trpaddr108\trpaddfl3\trpaddfr3
-\clbrdrl\brdrw10\brdrs\clbrdrt\brdrw10\brdrs\clbrdrr\brdrw10\brdrs\clbrdrb\brdrw10\brdrs \cellx1823\clbrdrl\brdrw10\brdrs\clbrdrt\brdrw10\brdrs\clbrdrr\brdrw10\brdrs\clbrdrb\brdrw10\brdrs \cellx10013
-\pard\intbl\widctlpar\qc\b 4\par
-\b0 (6 pts)\cell
-\pard\intbl\widctlpar I looked at the US food import data; the information was fascinating. I prefer JSON because I think it is formatted more nicely.\par
-\cell\row\trowd\trgaph108\trleft5\trbrdrl\brdrs\brdrw10 \trbrdrt\brdrs\brdrw10 \trbrdrr\brdrs\brdrw10 \trbrdrb\brdrs\brdrw10 \trpaddl108\trpaddr108\trpaddfl3\trpaddfr3
-\clbrdrl\brdrw10\brdrs\clbrdrt\brdrw10\brdrs\clbrdrr\brdrw10\brdrs\clbrdrb\brdrw10\brdrs \cellx1823\clbrdrl\brdrw10\brdrs\clbrdrt\brdrw10\brdrs\clbrdrr\brdrw10\brdrs\clbrdrb\brdrw10\brdrs \cellx10013
-\pard\intbl\widctlpar\qc\b\fs36 Part 2:\b0\fs32\cell
-\pard\intbl\widctlpar\fs36 Answer\fs32\cell\row\trowd\trgaph108\trleft5\trbrdrl\brdrs\brdrw10 \trbrdrt\brdrs\brdrw10 \trbrdrr\brdrs\brdrw10 \trbrdrb\brdrs\brdrw10 \trpaddl108\trpaddr108\trpaddfl3\trpaddfr3
-\clbrdrl\brdrw10\brdrs\clbrdrt\brdrw10\brdrs\clbrdrr\brdrw10\brdrs\clbrdrb\brdrw10\brdrs \cellx1823\clbrdrl\brdrw10\brdrs\clbrdrt\brdrw10\brdrs\clbrdrr\brdrw10\brdrs\clbrdrb\brdrw10\brdrs \cellx10013
-\pard\intbl\widctlpar\qc\b 5\par
-\b0 (2 pts)\cell
-\pard\intbl\widctlpar It prints the first line "hope is the thing with feathers" and counts how many times each word appears.\par
-\cell\row\trowd\trgaph108\trleft5\trbrdrl\brdrs\brdrw10 \trbrdrt\brdrs\brdrw10 \trbrdrr\brdrs\brdrw10 \trbrdrb\brdrs\brdrw10 \trpaddl108\trpaddr108\trpaddfl3\trpaddfr3
-\clbrdrl\brdrw10\brdrs\clbrdrt\brdrw10\brdrs\clbrdrr\brdrw10\brdrs\clbrdrb\brdrw10\brdrs \cellx1823\clbrdrl\brdrw10\brdrs\clbrdrt\brdrw10\brdrs\clbrdrr\brdrw10\brdrs\clbrdrb\brdrw10\brdrs \cellx10013
-\pard\intbl\widctlpar\qc\b 6\par
-\b0 (2 pts)\cell
-\pard\intbl\widctlpar Each piece is one line of the poem.\cell\row\trowd\trgaph108\trleft5\trbrdrl\brdrs\brdrw10 \trbrdrt\brdrs\brdrw10 \trbrdrr\brdrs\brdrw10 \trbrdrb\brdrs\brdrw10 \trpaddl108\trpaddr108\trpaddfl3\trpaddfr3
-\clbrdrl\brdrw10\brdrs\clbrdrt\brdrw10\brdrs\clbrdrr\brdrw10\brdrs\clbrdrb\brdrw10\brdrs \cellx1823\clbrdrl\brdrw10\brdrs\clbrdrt\brdrw10\brdrs\clbrdrr\brdrw10\brdrs\clbrdrb\brdrw10\brdrs \cellx10013
-\pard\intbl\widctlpar\qc\b 7\par
-\b0 (2 pts)\cell
-\pard\intbl\widctlpar Counting the words in each line is faster than counting the words in the whole poem.\cell\row\trowd\trgaph108\trleft5\trbrdrl\brdrs\brdrw10 \trbrdrt\brdrs\brdrw10 \trbrdrr\brdrs\brdrw10 \trbrdrb\brdrs\brdrw10 \trpaddl108\trpaddr108\trpaddfl3\trpaddfr3
-\clbrdrl\brdrw10\brdrs\clbrdrt\brdrw10\brdrs\clbrdrr\brdrw10\brdrs\clbrdrb\brdrw10\brdrs \cellx1823\clbrdrl\brdrw10\brdrs\clbrdrt\brdrw10\brdrs\clbrdrr\brdrw10\brdrs\clbrdrb\brdrw10\brdrs \cellx10013
-\pard\intbl\widctlpar\qc\b 8\par
-\b0 (2 pts)\cell
-\pard\intbl\widctlpar Instead of counting how many times each word appears, it counts how many times each character appears.\cell\row\trowd\trgaph108\trleft5\trbrdrl\brdrs\brdrw10 \trbrdrt\brdrs\brdrw10 \trbrdrr\brdrs\brdrw10 \trbrdrb\brdrs\brdrw10 \trpaddl108\trpaddr108\trpaddfl3\trpaddfr3
-\clbrdrl\brdrw10\brdrs\clbrdrt\brdrw10\brdrs\clbrdrr\brdrw10\brdrs\clbrdrb\brdrw10\brdrs \cellx1823\clbrdrl\brdrw10\brdrs\clbrdrt\brdrw10\brdrs\clbrdrr\brdrw10\brdrs\clbrdrb\brdrw10\brdrs \cellx10013
-\pard\intbl\widctlpar\qc\b 9\par
-\b0 (2 pts)\b\fs36\cell
-\pard\intbl\widctlpar\b0\fs32 Each line is split into words. We create an empty dictionary, then use a for loop to go through all the words. If the word is already in the dictionary, increase its count by one; if it is not, store it in the dictionary with a count of one.\fs36\cell\row\trowd\trgaph108\trleft5\trbrdrl\brdrs\brdrw10 \trbrdrt\brdrs\brdrw10 \trbrdrr\brdrs\brdrw10 \trbrdrb\brdrs\brdrw10 \trpaddl108\trpaddr108\trpaddfl3\trpaddfr3
-\clbrdrl\brdrw10\brdrs\clbrdrt\brdrw10\brdrs\clbrdrr\brdrw10\brdrs\clbrdrb\brdrw10\brdrs \cellx1823\clbrdrl\brdrw10\brdrs\clbrdrt\brdrw10\brdrs\clbrdrr\brdrw10\brdrs\clbrdrb\brdrw10\brdrs \cellx10013
-\pard\intbl\widctlpar\qc\b\fs32 10\par
-\b0 (2 pts)\cell
-\pard\intbl\widctlpar The reducer takes each word and its values from the mapper as input and emits the number of times the word is used in the poem.\par
-\cell\row\trowd\trgaph108\trleft5\trbrdrl\brdrs\brdrw10 \trbrdrt\brdrs\brdrw10 \trbrdrr\brdrs\brdrw10 \trbrdrb\brdrs\brdrw10 \trpaddl108\trpaddr108\trpaddfl3\trpaddfr3
-\clbrdrl\brdrw10\brdrs\clbrdrt\brdrw10\brdrs\clbrdrr\brdrw10\brdrs\clbrdrb\brdrw10\brdrs \cellx1823\clbrdrl\brdrw10\brdrs\clbrdrt\brdrw10\brdrs\clbrdrr\brdrw10\brdrs\clbrdrb\brdrw10\brdrs \cellx10013
-\pard\intbl\widctlpar\qc\b 11\par
-\b0 (5 pts)\cell
-\pard\intbl\widctlpar The first example counted the number of times a word appears in the line and then emitted it. This example instead emits more instances of the word in exchange for not counting them.\cell\row\trowd\trgaph108\trleft5\trbrdrl\brdrs\brdrw10 \trbrdrt\brdrs\brdrw10 \trbrdrr\brdrs\brdrw10 \trbrdrb\brdrs\brdrw10 \trpaddl108\trpaddr108\trpaddfl3\trpaddfr3
-\clbrdrl\brdrw10\brdrs\clbrdrt\brdrw10\brdrs\clbrdrr\brdrw10\brdrs\clbrdrb\brdrw10\brdrs \cellx1823\clbrdrl\brdrw10\brdrs\clbrdrt\brdrw10\brdrs\clbrdrr\brdrw10\brdrs\clbrdrb\brdrw10\brdrs \cellx10013
-\pard\intbl\widctlpar\qc\b 12\par
-\b0 (15 pts)\cell
-\pard\intbl\widctlpar Alice: 81.67\par
-Bob: 68.0\par
-Carol: 67.0\par
-Dave: 78.0\par
-Eve: 63.67\par
-\cell\row\trowd\trgaph108\trleft5\trbrdrl\brdrs\brdrw10 \trbrdrt\brdrs\brdrw10 \trbrdrr\brdrs\brdrw10 \trbrdrb\brdrs\brdrw10 \trpaddl108\trpaddr108\trpaddfl3\trpaddfr3
-\clbrdrl\brdrw10\brdrs\clbrdrt\brdrw10\brdrs\clbrdrr\brdrw10\brdrs\clbrdrb\brdrw10\brdrs \cellx1823\clbrdrl\brdrw10\brdrs\clbrdrt\brdrw10\brdrs\clbrdrr\brdrw10\brdrs\clbrdrb\brdrw10\brdrs \cellx10013
-\pard\intbl\widctlpar\qc\b 13\par
-\b0 (20 pts)\cell
-\pard\intbl\widctlpar Example 2 Student Scores:\par
-\par
-Mapper:\par
-def mapper(key, value):\par
-    grade_map = eval(key)\par
-    for student in grade_map:\par
-        grade = grade_map[student]\par
-        Wmr.emit(student, grade)\par
-\par
-Reducer:\par
-def reducer(key, values):\par
-    sum = 0\par
-    count = 0\par
-    for value in values:\par
-        sum = sum + float(value)\par
-        count += 1\par
-    if count > 0:\par
-        average = sum / count\par
-        Wmr.emit(key, average)\par
-\par
-Output:\par
-Alice 81.6666666667\par
-Bob 68.0\par
-Carol 67.0\par
-Dave 78.0\par
-Eve 63.6666666667\par
-\par
-Example 3 Enrollment:\par
-\par
-\{ \lquote Name\rquote :\rquote ARISE Academy Charter High School\rquote , \lquote Type\rquote :\rquote CS\rquote , \lquote Enrollments\rquote :\rquote 183\rquote , \lquote Male Dropouts\rquote :\rquote 1\rquote , \lquote Female Dropouts\rquote :\rquote 1\rquote , \lquote Dropouts\rquote :\rquote 2\rquote \}\par
-\{ \lquote Name\rquote :\rquote ASPIRA Bilingual Cyber Charter School\rquote , \lquote Type\rquote :\rquote CS\rquote , \lquote Enrollments\rquote :\rquote 57\rquote , \lquote Male Dropouts\rquote :\rquote 2\rquote , \lquote Female Dropouts\rquote :\rquote 6\rquote , \lquote Dropouts\rquote :\rquote 8\rquote \}\par
-\{ \lquote Name\rquote :\rquote Ad Prima CS\rquote , \lquote Type\rquote :\rquote CS\rquote , \lquote Enrollments\rquote :\rquote 26\rquote , \lquote Male Dropouts\rquote :\rquote 0\rquote , \lquote Female Dropouts\rquote :\rquote 0\rquote , \lquote Dropouts\rquote :\rquote 0\rquote \} \{ \lquote Name\rquote :\rquote Alliance for Progress CS\rquote , \lquote Type\rquote :\rquote CS\rquote , \lquote Enrollments\rquote :\rquote 24\rquote , \lquote Male Dropouts\rquote :\rquote 0\rquote , \lquote Female Dropouts\rquote :\rquote 0\rquote , \lquote Dropouts\rquote :\rquote 0\rquote \}\par
-\{ \lquote Name\rquote :\rquote Philadelphia City SD\rquote , \lquote Type\rquote :\rquote SD\rquote , \lquote Enrollments\rquote :\rquote 63983\rquote , \lquote Male Dropouts\rquote :\rquote 3092\rquote , \lquote Female Dropouts\rquote :\rquote 2644\rquote , \lquote Dropouts\rquote :\rquote 5736\rquote \}\par
-\par
-\par
-Mapper:\par
-\par
-def mapper(key, value):\par
-    grade_map = eval(key)\par
-    for student in grade_map:\par
-        if student == "Enrollments" or student == "Male Dropouts" or student == "Female Dropouts":\par
-            grade = grade_map[student]\par
-            Wmr.emit(student, grade)\par
-\par
-Reducer:\par
-\par
-def reducer(key, values):\par
-    count = 0\par
-    for value in values:\par
-        count += int(value)\par
-    Wmr.emit(key, count)\par
-\par
-Output:\par
-Enrollments 65273\par
-Female Dropouts 2651\par
-Male Dropouts 3095\par
-\cell\row
-\pard\widctlpar\par
-}
-�
\ No newline at end of file
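The word-count mapper and reducer described in answers 9 and 10 of the removed sheet can be sketched as a small local simulation. This is only an illustration under stated assumptions: `run_job` is a hypothetical stand-in for the grouping step that the lab's MapReduce environment performs between the two phases, the functions yield or return pairs instead of calling `Wmr.emit`, and the second poem line is sample input added here so that one word repeats across lines.

```python
from collections import defaultdict


def mapper(line):
    """Count each word within one line and yield (word, count) pairs."""
    counts = {}
    for word in line.split():
        if word in counts:
            counts[word] += 1  # word already seen in this line
        else:
            counts[word] = 1   # first occurrence in this line
    for word, count in counts.items():
        yield word, count


def reducer(word, counts):
    """Add up the per-line counts that the mappers produced for one word."""
    return word, sum(int(c) for c in counts)


def run_job(lines):
    """Hypothetical local driver: group mapper output by key, then reduce."""
    grouped = defaultdict(list)
    for line in lines:
        for word, count in mapper(line):
            grouped[word].append(count)
    return dict(reducer(word, counts) for word, counts in sorted(grouped.items()))


# Sample input: the first line comes from the answer sheet, the second is assumed.
poem = [
    "hope is the thing with feathers",
    "that perches in the soul",
]
totals = run_job(poem)
print(totals["the"])  # 2
```

Each mapper call sees only one line, which is why the per-line counts have to be summed again in the reducer.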