Python for data analysis with Matplotlib

Data analysis using Python with Matplotlib

1. Basic single plot (functional approach)

1.1. Setup

1.1.1. import matplotlib.pyplot as plt

1.1.2. %matplotlib inline

1.1.2.1. This magic command works only in Jupyter (IPython) notebooks; it removes the need to call plt.show() to display plots in cells

1.1.3. import numpy as np
x = np.linspace(0, 5, 11)
y = x ** 2

1.1.3.1. This creates two arrays of 11 numbers, with x being evenly spaced numbers between 0 and 5 inclusive, and y representing the square of each

1.1.3.2. Note: these two numpy arrays, x and y, are re-used throughout all the examples in this mindmap unless explicitly stated otherwise
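As a quick check of the setup above, the following minimal sketch builds the two arrays and inspects them. Only NumPy is assumed to be installed; the variable name step is introduced here purely for illustration.

```python
import numpy as np

# 11 evenly spaced values from 0 to 5, with both endpoints included
x = np.linspace(0, 5, 11)

# Element-wise square of every value in x
y = x ** 2

# Spacing between neighbouring x values (hypothetical helper variable)
step = x[1] - x[0]

print(x)
print(y)
print("spacing:", step)
```

Because linspace() takes the number of points rather than a step size, 11 points over the interval 0 to 5 give a spacing of 0.5.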

1.2. plt.plot(x,y)

1.2.1. returns

1.2.1.1. see attached

1.2.1.1.1. Using the plot() function we add a new plot to the canvas, passing in an array for the x axis and another array for the y axis

1.3. plt.plot(x,y,'r')

1.3.1. returns

1.3.1.1. see attached

1.3.1.1.1. The 'r' format string changes the line colour from the default blue to red

1.4. plt.plot(x,y)
plt.xlabel('X Axis Title Here')
plt.ylabel('Y Axis Title Here')
plt.title('String Title Here')

1.4.1. returns

1.4.1.1. see attached

1.4.1.1.1. The xlabel() function allows us to add a label for the x-axis to our most recently added plot

1.4.1.1.2. The ylabel() function allows us to add a label for the y-axis to our most recently added plot

1.4.1.1.3. The title() function allows us to add a label at the top of the plot, representing its title
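The functional calls above can be put together as a small script. This is a sketch only: the matplotlib.use("Agg") line is an assumption for running without a display (leave it out inside a Jupyter notebook), and plt.gca() is used here just to read the labels back from the current axes.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; omit this inside Jupyter
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 5, 11)
y = x ** 2

# Each pyplot function acts on the "current" plot
plt.plot(x, y, 'r')                 # red line
plt.xlabel('X Axis Title Here')     # label under the x-axis
plt.ylabel('Y Axis Title Here')     # label beside the y-axis
plt.title('String Title Here')      # title above the plot

# Read the labels back from the axes the calls above acted on
ax = plt.gca()
print(ax.get_xlabel(), '|', ax.get_ylabel(), '|', ax.get_title())
```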

2. Basic multi plot (functional approach)

2.1. We use the subplot() function to create multiple plots

2.1.1. Same setup as the basic single plot example

2.1.2. plt.subplot(1,2,1)
plt.plot(x, y, 'r--')
plt.subplot(1,2,2)
plt.plot(y, x, 'g*-');

2.1.2.1. returns

2.1.2.1.1. see attached

2.1.2.2. Arguments for subplot() are:

2.1.2.2.1. plt.subplot(nrows, ncols, plot_number)
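To make the subplot() arguments concrete, here is a hedged sketch of the same two-panel layout. The explicit plt.figure() call and the Agg backend line are assumptions added so the example starts from a clean canvas and runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; omit this inside Jupyter
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 5, 11)
y = x ** 2

fig = plt.figure()         # start from an empty canvas

plt.subplot(1, 2, 1)       # 1 row, 2 columns, select plot number 1 (left)
plt.plot(x, y, 'r--')      # red dashed line

plt.subplot(1, 2, 2)       # same grid, select plot number 2 (right)
plt.plot(y, x, 'g*-')      # green solid line with star markers

print("plots on the canvas:", len(fig.axes))
```

Note that plot_number counts from 1, not 0, and each plt.plot() call draws on whichever subplot was selected most recently.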

3. Create figure (canvas) and add axes (object-oriented approach)

3.1. With the object-oriented approach, the starting point is to create a new figure object

3.1.1. A figure object can be thought of as a canvas for adding one or more plots to

3.1.2. fig = plt.figure()

3.1.2.1. Returns a figure object, whose methods and attributes we can then use

3.1.3. With a new figure object created, our next task is to add one or more axes, depending on how many plots we intend to add to it

3.1.3.1. We use the add_axes() method to add these axes to a figure

3.1.3.1.1. We must pass in a list of 4 numbers as an argument to the add_axes() method

3.1.3.1.2. The four values are [left, bottom, width, height], each given as a fraction (between 0 and 1) of the figure's width or height
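A minimal sketch of add_axes() follows, reading the position back with get_position() to show what the four numbers mean. The specific values 0.1 and 0.8 are illustrative choices, and the Agg backend line is an assumption for running without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; omit this inside Jupyter
import matplotlib.pyplot as plt

fig = plt.figure()

# [left, bottom, width, height] as fractions of the figure size:
# start 10% in from the left and bottom edges, and span 80% each way
ax = fig.add_axes([0.1, 0.1, 0.8, 0.8])

# bounds gives the same four quantities back
left, bottom, width, height = ax.get_position().bounds
print(left, bottom, width, height)
```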

4. Plot data to figure axes (object-oriented approach)

4.1. Let's extend the figure axes idea and plot data onto the axes with the following code

4.1.1. fig = plt.figure()
axes1 = fig.add_axes([0.0,0.0,1.0,1.0])
axes1.plot(x,y)
axes1.set_title("BIG PLOT")
axes2 = fig.add_axes([0.2,0.5,0.4,0.3])
axes2.plot(y,x)
axes2.set_title("SMALL PLOT")

4.1.1.1. returns

4.1.1.1.1. see attached

4.1.1.2. Note that x and y are arrays, and the setup was originally done as part of the basic plot functional approach

5. Multi plots using subplots() function (object-oriented approach)

5.1. Using the subplots() function, we can get a jump start into the object-oriented approach

5.2. The subplots() function actually returns a tuple of Matplotlib objects; the 1st is a Figure and the 2nd is an Axes object

5.2.1. We can demonstrate this using tuple unpacking

5.2.1.1. fig, axes = plt.subplots()

5.2.1.1.1. type(fig)

5.2.1.1.2. type(axes)

5.3. By using subplots(), we can save ourselves the code of calling the add_axes() method on a figure object

5.3.1. We can demonstrate this by immediately calling the plot() method on the axes variable

5.3.1.1. fig, axes = plt.subplots()
axes.plot(x,y)

5.3.1.1.1. returns

5.4. The subplots() function takes nrows and ncols for arguments, both of which default to 1

5.4.1. When we increase either nrows or ncols value above 1, this means that the 2nd variable in the tuple returned by the subplots() function is actually an array of axes

5.4.1.1. For example:

5.4.1.1.1. fig, axes = plt.subplots(1,2)
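The return values described above can be inspected directly. This sketch assumes a headless run (hence the Agg line); the names fig2, fig3 and grid are introduced here only for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; omit this inside Jupyter
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.axes import Axes
from matplotlib.figure import Figure

# Default nrows=1, ncols=1: the 2nd item is a single Axes object
fig, ax = plt.subplots()
print(isinstance(fig, Figure), isinstance(ax, Axes))

# One row, two columns: the 2nd item is a 1-D NumPy array of Axes
fig2, axes = plt.subplots(1, 2)
print(type(axes).__name__, axes.shape)

# Two rows, two columns: the array becomes 2-D, indexed as [row, col]
fig3, grid = plt.subplots(2, 2)
print(grid.shape, isinstance(grid[0, 1], Axes))
```

With a 1-D array we index as axes[0] and axes[1]; with a 2-D grid we need a row and a column, for example grid[0, 1].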

5.5. Here's how we can create a couple of plots using this approach

5.5.1. fig, axes = plt.subplots(nrows=1,ncols=2)
axes[0].plot(x,y,'b')
axes[0].set_title("Plot 1")
axes[1].plot(y,x,'r')
axes[1].set_title("Plot 2")

5.5.1.1. returns

5.5.1.1.1. see attached

6. Set figure size (and hence aspect ratio) and DPI for figure (canvas) using figure() function or subplots() function

6.1. When using the figure() function or subplots() function, we have the option to specify the figure size as figsize=(width, height) in inches, which also fixes the aspect ratio (width:height), and the dots per inch (dpi)

6.1.1. Example using figure()

6.1.1.1. fig = plt.figure(figsize=(8,4),dpi=100)
ax = fig.add_axes([0,0,1,1])
ax.plot(x,y)

6.1.1.1.1. Note that x and y are arrays, and the setup was originally done as part of the basic plot functional approach

6.1.1.1.2. returns

6.1.2. Examples using subplots()

6.1.2.1. fig, axes = plt.subplots(figsize=(8,4),dpi=100)
axes.plot(x,y)

6.1.2.1.1. returns

6.1.2.2. fig, axes = plt.subplots(nrows=2,ncols=1,figsize=(8,4),dpi=100)
axes[0].plot(x,y)
axes[1].plot(y,x)

6.1.2.2.1. returns
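To see how figsize and dpi relate, the sketch below reads both back from the figure and multiplies them to get a size in pixels. The Agg backend line is an assumption for running without a display, and the variable names are illustrative.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; omit this inside Jupyter
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 5, 11)
y = x ** 2

# 8 inches wide by 4 inches tall, at 100 dots per inch
fig, ax = plt.subplots(figsize=(8, 4), dpi=100)
ax.plot(x, y)

width_in, height_in = fig.get_size_inches()
dpi = fig.get_dpi()

# Size in pixels is inches multiplied by dots per inch
width_px, height_px = width_in * dpi, height_in * dpi
print(width_in, height_in, dpi, width_px, height_px)
```

So figsize controls the shape of the canvas, while dpi controls how many pixels are used to render each inch of it.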

7. Save figure as image file

7.1. By invoking the savefig() method on a figure object, we have the option to save the figure in a number of supported formats

7.1.1. Supported formats include PNG, JPG, EPS, SVG, PGF and PDF

7.1.2. All we have to do when passing the 1st argument to savefig() is pass the path + file name as a string, including a supported filename extension

7.1.2.1. The method is smart enough to figure out the required format from the extension

7.1.2.2. fig.savefig("my_plot.png")

7.1.2.3. If we did not set the dpi when creating the figure, we can pass it as an argument to the savefig() method instead

7.1.2.3.1. fig.savefig("my_plot.png", dpi=200)
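A hedged sketch of saving in two formats follows. Writing into a temporary directory is an assumption made so the example leaves no files behind; in practice any writable path works. The supported formats are listed via the canvas, and the exact list depends on the installation.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # assumption: headless run; omit this inside Jupyter
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 5, 11)
y = x ** 2

fig, ax = plt.subplots()
ax.plot(x, y)

# Assumption: a throwaway directory, used here only to keep the example tidy
out_dir = tempfile.mkdtemp()
png_path = os.path.join(out_dir, "my_plot.png")
pdf_path = os.path.join(out_dir, "my_plot.pdf")

fig.savefig(png_path, dpi=200)   # format inferred from ".png"
fig.savefig(pdf_path)            # format inferred from ".pdf"

# Formats this installation can write, keyed by filename extension
supported = fig.canvas.get_supported_filetypes()
print(sorted(supported))
print(os.path.getsize(png_path), os.path.getsize(pdf_path))
```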

8. Zooming in on a plot using the set_xlim() and set_ylim() methods, and also the axis() method with the 'tight' option

8.1. In a plot with many data points, it can be useful to zoom in on a specific area to get a better picture of what's going on around a particular range of values

8.1.1. We can achieve this zooming in by calling the set_xlim() and set_ylim() methods

8.1.1.1. We can also tighten up the axes to eliminate redundant space on the x and y axis by invoking axis('tight')

8.1.1.1.1. fig, axes = plt.subplots(1, 3, figsize=(12, 4))
axes[0].plot(x, x**2, x, x**3)
axes[0].set_title("default axes ranges")
axes[1].plot(x, x**2, x, x**3)
axes[1].axis('tight')
axes[1].set_title("tight axes")
axes[2].plot(x, x**2, x, x**3)
axes[2].set_ylim([0, 60])
axes[2].set_xlim([2, 5])
axes[2].set_title("custom axes range");
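The effect of set_xlim() and set_ylim() can be confirmed by reading the limits back with get_xlim() and get_ylim(). This is a small sketch on a single plot; the Agg backend line and the name default_xlim are assumptions for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; omit this inside Jupyter
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 5, 11)

fig, ax = plt.subplots()
ax.plot(x, x**2, x, x**3)

# Limits chosen automatically: the data range plus a small margin
default_xlim = ax.get_xlim()
print("default x range:", default_xlim)

# Zoom in on x between 2 and 5, and y between 0 and 60
ax.set_xlim([2, 5])
ax.set_ylim([0, 60])
print("custom ranges:", ax.get_xlim(), ax.get_ylim())
```

Data outside the chosen limits is still part of the plot; it is simply not shown.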

9. Other types of plots

9.1. This mindmap covers the base line plot but Matplotlib supports many other types of plot too, such as scatter plots, histograms, box plots, etc.

9.1.1. However, we will focus on using the Seaborn library for statistical plots
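For reference, here is a brief sketch of three of the other plot types named above, using the Axes methods scatter(), hist() and boxplot(). The random sample data is made up for illustration, and the Agg backend line is an assumption for running without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; omit this inside Jupyter
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 5, 11)
y = x ** 2

# Assumption: 200 normally distributed values, generated only for this demo
rng = np.random.default_rng(0)
data = rng.normal(size=200)

fig, axes = plt.subplots(1, 3, figsize=(12, 4))

axes[0].scatter(x, y)                       # one marker per (x, y) pair
axes[0].set_title("scatter")

# hist() returns the bin counts, the bin edges and the drawn bars
counts, bin_edges, _ = axes[1].hist(data, bins=10)
axes[1].set_title("histogram")

axes[2].boxplot(data)                       # median, quartiles and outliers
axes[2].set_title("box plot")

print(counts.sum(), len(bin_edges))
```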

10. Why Matplotlib?

10.1. Matplotlib is one of the most widely used Python libraries for plotting data

10.2. It gives you control over every aspect of a plot (a.k.a. "figure")

10.3. Works very well with Pandas and Numpy arrays

10.4. Visit matplotlib.org and click Examples to see all the different types of plot you can do with the library

11. Installing Matplotlib

11.1. conda install matplotlib

11.1.1. for Anaconda based Python installations

11.2. pip install matplotlib

11.2.1. for non Anaconda based Python installations

12. Import Matplotlib to create plots inside Jupyter notebook

12.1. import matplotlib.pyplot as plt

13. Magic command to enable Matplotlib plots to be rendered inside Jupyter notebook

13.1. %matplotlib inline

13.1.1. If you are not using a Jupyter notebook, you can use the following command each time you want to see your plot

13.1.1.1. plt.show()

13.1.1.1.1. Note that plt is an alias here for matplotlib.pyplot

14. Functional vs object-oriented approach

14.1. There are two different ways to generate plots using matplotlib; the functional approach is perhaps easier to pick up, but the object-oriented approach is the preferred way

14.1.1. Functional approach involves calling the matplotlib.pyplot.plot() function, followed by other functions to add further elements to your plot

14.1.1.1. Also the subplot() function when building up a multiplot canvas

14.1.2. Object-oriented approach involves creating a figure object and then invoking methods and attributes on that object to build up the plot

14.1.2.1. Think of a figure as a canvas to which you add plots

14.1.2.1.1. Each canvas can take multiple plots
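The two approaches can be compared side by side by drawing the same line both ways. This is a sketch under the assumption of a headless run (the Agg line); the names functional_ax and oo_ax are introduced only for the comparison.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; omit this inside Jupyter
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 5, 11)
y = x ** 2

# Functional approach: pyplot functions act on an implicit "current" plot
plt.figure()
plt.plot(x, y)
plt.title("functional")
functional_ax = plt.gca()   # fetched only so we can compare below

# Object-oriented approach: create the figure and axes, then call methods
fig = plt.figure()
oo_ax = fig.add_axes([0.1, 0.1, 0.8, 0.8])
oo_ax.plot(x, y)
oo_ax.set_title("object-oriented")

# Both routes end up with the same line data on an axes
same = np.array_equal(functional_ax.lines[0].get_ydata(),
                      oo_ax.lines[0].get_ydata())
print(same)
```

The object-oriented version is more explicit about which axes each call applies to, which matters once a figure holds several plots.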

15. Handle overlapping plots with tight_layout() function

15.1. When we have a Matplotlib figure with multiple plots, we can sometimes have a bit of a space issue with plots being too close to each other

15.1.1. We can auto-adjust for this by including a call to the tight_layout() function

15.1.1.1. plt.tight_layout()
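A minimal sketch of where the tight_layout() call fits: after all plots, titles and labels have been added. The 2 x 2 grid and the label text are illustrative assumptions, as is the Agg backend line.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; omit this inside Jupyter
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 5, 11)
y = x ** 2

fig, axes = plt.subplots(nrows=2, ncols=2)

# Titles and axis labels are what usually collide between neighbours
for ax in axes.flat:
    ax.plot(x, y)
    ax.set_title("Title")
    ax.set_xlabel("X label")
    ax.set_ylabel("Y label")

before = axes[0, 0].get_position().bounds

# Recompute the spacing so labels and titles get room of their own
plt.tight_layout()

after = axes[0, 0].get_position().bounds
print(before)
print(after)
```

Printing the position of the first plot before and after the call shows how the layout was adjusted.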

16. Adding legend to a plot

16.1. We need to start thinking about legends when adding more than one series to a plot

16.1.1. When we add two (or more) plots to the axes, we should also pass in a label argument

16.1.1.1. After creating the axes plots with labels, we should then invoke the legend() method

16.1.1.1.1. fig = plt.figure()
ax = fig.add_axes([0,0,1,1])
ax.plot(x, x**2, label="x squared")
ax.plot(x, x**3, label="x cubed")
ax.legend()

16.1.1.1.2. We can also pass in a location (loc) string or integer code to legend(); options include 'best' (0), 'upper right' (1), 'upper left' (2), 'lower left' (3) and 'lower right' (4)
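Here is a short sketch of a legend placed with the loc argument, with the legend entries read back to confirm they come from the label arguments. The choice of "upper left" is illustrative, and the Agg backend line is an assumption for running without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; omit this inside Jupyter
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 5, 11)

fig, ax = plt.subplots()
ax.plot(x, x**2, label="x squared")
ax.plot(x, x**3, label="x cubed")

# loc accepts a string such as "upper left" or its integer code (2)
legend = ax.legend(loc="upper left")

# The legend text is taken from each line's label argument
labels = [text.get_text() for text in legend.get_texts()]
print(labels)
```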

17. Controlling line colour and transparency on a plot

17.1. fig, ax = plt.subplots()
ax.plot(x, x+1, color="blue", alpha=0.5) # half-transparent
ax.plot(x, x+2, color="#8B008B") # RGB hex code
ax.plot(x, x+3, color="#FF8C00") # RGB hex code

17.1.1. Note that the color argument accepts basic colours and also custom colours via a valid RGB hex code

17.1.2. Note that we can apply transparency to a line using the alpha argument

17.1.3. returns

17.1.3.1. see attached

18. Controlling line width on a plot

18.1. fig, ax = plt.subplots(figsize=(12,6))
ax.plot(x, x+1, color="red", linewidth=0.25)
ax.plot(x, x+2, color="red", linewidth=0.50)
ax.plot(x, x+3, color="red", linewidth=1.00)
ax.plot(x, x+4, color="red", linewidth=2.00)

18.1.1. Note that a larger linewidth value gives a thicker line and a smaller value a thinner one (the default is 1.5 points in Matplotlib 2.0 and later)

18.1.2. Note that linewidth argument can also be specified as lw (e.g. lw=2)

18.1.3. returns

18.1.3.1. see attached

19. Controlling line style on a plot

19.1. fig, ax = plt.subplots(figsize=(12,6))
ax.plot(x, x+5, color="green", lw=3, linestyle='-')
ax.plot(x, x+6, color="green", lw=3, ls='-.')
ax.plot(x, x+7, color="green", lw=3, ls=':')

19.1.1. Note that the linestyle argument is passed as a string

19.1.2. Note that linestyle argument can also be specified as ls (e.g. ls='--')

19.1.3. returns

19.1.3.1. see attached

20. Adding markers to your plot lines (for the data points)

20.1. fig, ax = plt.subplots(figsize=(12,6))
ax.plot(x, x+ 9, color="blue", lw=0.5, ls='-', marker='+')
ax.plot(x, x+10, color="blue", lw=1, ls='--', marker='o')
ax.plot(x, x+11, color="blue", lw=1, ls='-', marker='s')
ax.plot(x, x+12, color="blue", lw=0.5, ls='--', marker='1')

20.1.1. Note that the marker argument is passed as a string

20.1.2. returns

20.1.2.1. see attached

21. Controlling marker size, face colour and edge colour

21.1. fig, ax = plt.subplots(figsize=(12,6))
ax.plot(x, x+13, color="purple", lw=1, ls='-', marker='o', markersize=2)
ax.plot(x, x+14, color="purple", lw=1, ls='-', marker='o', markersize=4)
ax.plot(x, x+15, color="purple", lw=1, ls='-', marker='o', markersize=8, markerfacecolor="red")
ax.plot(x, x+16, color="purple", lw=1, ls='-', marker='s', markersize=8, markerfacecolor="yellow", markeredgewidth=3, markeredgecolor="green");

21.1.1. Note that the markersize argument sets the marker size in points; the default is 6, so values above that give a more pronounced marker

21.1.2. To have the marker face colour differ from the line colour, we can specify the markerfacecolor argument

21.1.3. We can even change the thickness and colour of the marker edge using the markeredgewidth and markeredgecolor arguments

21.1.4. returns

21.1.4.1. see attached
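The styling arguments from the last few sections can be combined on a single line and read back from the Line2D object that plot() returns. This is a sketch only: the particular style values are illustrative, the Agg backend line is an assumption for running without a display, and the defaults printed at the end come from the local Matplotlib configuration, so they may differ between installations.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; omit this inside Jupyter
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 5, 11)

fig, ax = plt.subplots(figsize=(12, 6))

# plot() returns a list of lines; unpack the single line we drew
(line,) = ax.plot(
    x, x + 1,
    color="#8B008B", alpha=0.5,        # colour and transparency
    lw=3, ls="--",                     # line width and style
    marker="o", markersize=8,          # marker shape and size
    markerfacecolor="yellow",          # fill colour of the marker
    markeredgewidth=3,                 # thickness of the marker outline
    markeredgecolor="green",           # colour of the marker outline
)

# Read the settings back from the line object
print(line.get_linewidth(), line.get_linestyle(),
      line.get_marker(), line.get_markersize(), line.get_alpha())

# Defaults used when an argument is left out
print(plt.rcParams["lines.linewidth"], plt.rcParams["lines.markersize"])
```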