**Topic**

In this assignment, you should work with the **books.csv** file. This file contains detailed information about books scraped from Goodreads. The dataset was downloaded from the Kaggle website.

The file has ten columns. A description of each column follows:

**bookID**: A unique identification number for each book.

**title**: The name under which the book was published.

**authors**: Names of the authors of the book. Multiple authors are delimited with -.

**average_rating**: The average rating the book received in total.

**isbn**: Another unique number to identify the book, the International Standard Book Number.

**isbn13**: A 13-digit ISBN to identify the book, instead of the standard 10-digit ISBN.

**language_code**: The primary language of the book.

**num_pages**: Number of pages the book contains.

**ratings_count**: Total number of ratings the book received.

**text_reviews_count**: Total number of written text reviews the book received.

**Task**

Write code to do the following:

Use pandas to read the file as a dataframe named books. The **bookID** column should be the index of the dataframe.
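As a minimal sketch of this step, the example below reads CSV text into a dataframe with bookID as the index. The sample rows are made up for illustration, and the helper name `load_books` is hypothetical. Reading `isbn` as a string and stripping whitespace from column names are precautions assumed here, not requirements of the assignment.

```python
import io

import pandas as pd

# Made-up rows standing in for books.csv (illustration only, not real data).
SAMPLE_CSV = """bookID,title,authors,average_rating,isbn,isbn13,language_code,num_pages,ratings_count,text_reviews_count
1,Book A,Author One,4.10,0000000001,9780000000001,eng,320,1500,120
2,Book B,Author Two-Author Three,3.75,0000000002,9780000000002,eng,210,800,45
3,Book C,Author One,4.40,0000000003,9780000000003,spa,450,2300,310
"""


def load_books(source):
    """Read the CSV and use the bookID column as the dataframe index."""
    # dtype={"isbn": str} keeps leading zeros that an integer parse would drop.
    books = pd.read_csv(source, index_col="bookID", dtype={"isbn": str})
    # Precaution: remove stray spaces around column names, if there are any.
    books.columns = books.columns.str.strip()
    return books


# With the real file, pass the path instead: load_books("books.csv")
books = load_books(io.StringIO(SAMPLE_CSV))
print(books.index.name)
```

Because bookID becomes the index, it no longer appears among the regular columns of `books`.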

Use books.head() to see the first 5 rows of the dataframe.

Use books.shape to find the number of rows and columns in the dataframe.

Use books.describe() to summarize the data.

Use books['authors'].describe() to find the number of unique authors in the dataset and the most frequent author.
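The four inspection calls above can be sketched on a tiny dataframe. The rows below are made up for illustration and use only a few of the ten columns, so the printed numbers say nothing about the real dataset.

```python
import pandas as pd

# Made-up rows (illustration only); bookID is the index, as in the task.
books = pd.DataFrame(
    {
        "title": ["Book A", "Book B", "Book C", "Book D"],
        "authors": ["Author One", "Author Two", "Author One", "Author Three"],
        "average_rating": [4.10, 3.75, 4.40, 3.90],
        "num_pages": [320, 210, 450, 150],
    },
    index=pd.Index([1, 2, 3, 4], name="bookID"),
)

# First rows of the dataframe (at most 5 by default).
print(books.head())

# shape is an attribute, not a method: (number of rows, number of columns).
n_rows, n_cols = books.shape
print(n_rows, n_cols)

# Summary statistics; by default only numeric columns are included.
print(books.describe())

# For a text column, describe() reports count, unique, top and freq.
author_summary = books["authors"].describe()
print(author_summary)
```

In the `authors` summary, `unique` is the number of distinct author strings, `top` is the most frequent one, and `freq` is how often it appears. Note that `shape` does not count the index, so a dataframe indexed by bookID reports one column fewer than the CSV file contains.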

Use OLS regression to test whether the average rating of a book depends on its number of pages, number of ratings, and total number of written text reviews.
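One possible way to set up this regression is sketched below. It assumes the statsmodels library, which the assignment does not name. The data are synthetic, generated only so the example runs on its own; in the assignment, the dataframe read from books.csv would take their place.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Synthetic data (illustration only): average_rating is built to depend
# on num_pages but not on the other two predictors.
rng = np.random.default_rng(0)
n = 200
books = pd.DataFrame(
    {
        "num_pages": rng.integers(50, 900, n),
        "ratings_count": rng.integers(0, 5000, n),
        "text_reviews_count": rng.integers(0, 500, n),
    }
)
books["average_rating"] = (
    3.5 + 0.001 * books["num_pages"] + rng.normal(0, 0.2, n)
)

# Dependent variable on the left of "~", predictors on the right.
# The formula interface adds the intercept automatically.
model = smf.ols(
    "average_rating ~ num_pages + ratings_count + text_reviews_count",
    data=books,
).fit()

print(model.summary())   # full regression table
print(model.params)      # estimated coefficients
print(model.pvalues)     # p-value for each coefficient
print(model.rsquared)    # share of variance explained
```

When reading the output, each coefficient estimates the change in average rating for a one-unit increase in that predictor with the others held fixed. A p-value below the chosen significance level (0.05 is a common choice) suggests the coefficient differs from zero, and R-squared indicates how much of the variation in ratings the model explains.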

Summarize your findings in part 1 (**all 6 sections**) in a Word file. Include your code and a summary of the results, such as the number of rows in the dataset and an **interpretation of regression results**.