2022-07-20

Python theoretical knowledge (6): importing and using modules


1. The concept of a module

As a program grows, the code in a single file becomes longer and longer and increasingly difficult to maintain.

To keep code maintainable, we group related functions into separate files so that each file contains relatively little code. Many programming languages organize code this way. In Python, each .py file is a module.

So, what are the benefits of using modules?

The biggest benefit is that the code becomes much easier to maintain. Second, you do not have to start from scratch: once a module is written, it can be reused elsewhere. Our programs routinely use other modules, including Python's built-in modules and third-party modules.
Modules also prevent conflicts between function and variable names. Each module has its own namespace, so functions and variables with the same name can exist in different modules, and when we write our own modules we do not need to worry about clashing with names in other modules. Still, try not to reuse the names of built-in functions.
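As a small illustration of independent namespaces, the sketch below uses two standard library modules, math and cmath, which both define a function called sqrt. The choice of these two modules is only an example; any two modules that share a name behave the same way.

```python
import math
import cmath

# Both modules define sqrt, but each lives in its own namespace,
# so the module prefix decides which one is called.
real_root = math.sqrt(4)       # square root for real numbers
complex_root = cmath.sqrt(-4)  # square root that accepts negative input

print(real_root)     # 2.0
print(complex_root)  # 2j
```

Because the functions are always reached through the module name, neither definition overwrites the other.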

2. Classification of modules

There are three kinds of modules:

Built-in standard modules (also known as the standard library). Run help('modules') to list every module available to the interpreter, including the standard library
Third-party open source modules, which can be installed online with pip install followed by the module name
Custom modules, which you write yourself

3. Third-party open source modules

PyPI
https://pypi.org/

PyPI is the open source package repository for Python. As of July 31, 2020, it hosted 253763 modules contributed by Python developers all over the world, covering almost anything you might want to do with Python. Any Python developer who registers an account can upload modules to the platform, so that developers everywhere can easily download and use them.

3.1. Install via setup.py

Click Download files on the module's PyPI page. After downloading, unpack the archive, enter the directory, and run the following commands to complete the installation.

Shell

    python setup.py build    # Build the package
    python setup.py install  # Install the package
  

3.2. Install via pip

Run:

Shell

    pip3 install requests  # requests is the module name
  

The pip command automatically downloads the module package and completes the installation.

The package is normally installed into this subdirectory of your Python installation directory:

Text

    \Python installation directory\Lib\site-packages
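If you are not sure where that directory is on your machine, one way to find out is to ask the interpreter itself. The sketch below uses the standard library sysconfig module; the variable name site_packages_dir is just an illustrative choice.

```python
import sysconfig

# "purelib" is the directory where pure-Python packages are installed
# for the interpreter that runs this code.
site_packages_dir = sysconfig.get_paths()["purelib"]
print(site_packages_dir)
```

The printed path differs between operating systems and between virtual environments, so run it with the same interpreter you use for pip.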
  

By default, pip downloads from the official Python package server abroad, which can be relatively slow. It is recommended to switch to a mirror, such as the one provided by Tsinghua University.

For more software sources, refer to the article on this site:

Comparison of domestic software sources
https://blog.tsinbei.com/archives/238/

For a tutorial on replacing the software source, see:

Server optimization (2) Replacing the software source
https://blog.tsinbei.com/archives/237/

4. Module import and call

Python

    import random  # Import the entire random module

    from random import randint  # Import only the randint function from the random module

    from random import randint as suijishu  # Import randint from the random module and rename it to suijishu

    from random import *  # Import every public name from random (no random. prefix needed when calling); strongly discouraged

    random.randint(1, 10)  # Call a function through the module name

Note: importing a module is equivalent to executing the code in that .py file.

5. Custom modules

Any .py file you create is a module and can be imported by another program.
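The following is a minimal sketch of writing and importing a custom module. To keep it self-contained, it creates the module file in a temporary directory and adds that directory to the search path; the module name greeting_demo and its hello function are hypothetical and exist only for this demonstration.

```python
import sys
import tempfile
from pathlib import Path

# Hypothetical module contents, used only for this demonstration.
module_source = (
    'print("greeting_demo is being executed")\n'
    "\n"
    "def hello(name):\n"
    '    return f"Hello, {name}!"\n'
)

with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "greeting_demo.py").write_text(module_source, encoding="utf-8")
    sys.path.insert(0, tmp)  # make the temporary directory importable
    try:
        import greeting_demo  # the module's top-level code runs here
        import greeting_demo  # already cached in sys.modules, so nothing runs again
        message = greeting_demo.hello("world")
    finally:
        sys.path.remove(tmp)

print(message)  # Hello, world!
```

The print statement inside the module runs only on the first import, which shows that importing executes the file once and later imports reuse the cached module.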

A module you write yourself can, by default, only be imported by programs in the same directory. If you import it from a different directory, you get an error saying the module cannot be found. This is because of the module search path, which is stored in the list sys.path.


When you import a module, the Python interpreter searches the directories in sys.path in order for a module with the name you are importing. As soon as the name is matched in one directory, that module is imported and the search stops.

Note that the first element of the list is the directory of the script being run (the current directory in an interactive session), so modules you define there are found first.

To create a module that can be imported from anywhere, make sure the module file is in at least one of the directories on the search path.
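To see the search path on your own machine, you can print sys.path and, if needed, append a directory to it at run time. This is only a sketch: the directory /opt/my_modules is a hypothetical placeholder, and a change made this way lasts only for the current process.

```python
import sys

# Print the search path in the order Python checks it.
for index, directory in enumerate(sys.path):
    print(index, repr(directory))

# Hypothetical directory that holds your own modules.
extra_dir = "/opt/my_modules"
if extra_dir not in sys.path:
    sys.path.append(extra_dir)  # searched last, after the existing entries
```

Appending keeps the existing search order intact, so your directory cannot accidentally shadow a standard library module.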

We generally put modules we write ourselves in a directory whose name contains "site-packages"; the third-party modules we download and install from the Internet are generally placed in this directory as well.

6. Commonly used modules

6.1. os module

The os module provides many functions that let your program interact directly with the operating system.

Python

    import os  # Import the module
    import signal  # Needed for the signal constant passed to os.kill
    os.getcwd()  # Get the current working directory, i.e. the directory the Python script is running in
    os.listdir()  # Return the names of all files and directories in the given directory (default: current directory)
    os.remove(path)  # Remove a file
    os.removedirs(r"c:\python")  # Remove a directory, then each parent directory that is left empty
    os.path.isfile(path)  # Check whether the path is a file
    os.path.isdir(path)  # Check whether the path is a directory
    os.path.isabs(path)  # Check whether the path is absolute
    os.path.exists(path)  # Check whether the path exists
    os.path.split(path)  # Split the path into directory name and file name
    os.path.splitext(path)  # Split off the file extension
    os.path.dirname(path)  # Get the directory part of the path
    os.path.abspath(path)  # Get the absolute path
    os.path.basename(path)  # Get the file name
    os.system(command)  # Run a shell command
    os.getenv("HOME")  # Read an operating system environment variable
    os.environ  # Mapping of all environment variables of the operating system
    os.environ.setdefault('HOME','/home/alex')  # Set an environment variable only if it is not already set
    os.rename(old,new)  # Rename
    os.makedirs(r"c:\python\test")  # Create multi-level directories
    os.mkdir("test")  # Create a single directory
    os.stat(file)  # Get file attributes
    os.chmod(file, 0o644)  # Modify file permissions (the mode argument is required)
    os.path.getsize(filename)  # Get the file size in bytes
    os.chdir(dirname)  # Change the working directory to dirname
    os.get_terminal_size()  # Get the size of the current terminal
    os.kill(10884,signal.SIGKILL)  # Kill the process with this PID (signal.SIGKILL is not available on Windows)
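The list above is a reference, so here is a short runnable sketch that combines several of those functions. It works inside a temporary directory so that nothing on your disk is touched; the file and directory names (demo, logs, app.log) are arbitrary examples.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as base:
    # Create nested directories and a small file inside them.
    nested = os.path.join(base, "demo", "logs")
    os.makedirs(nested)
    file_path = os.path.join(nested, "app.log")
    with open(file_path, "w") as fh:
        fh.write("hello")

    is_file = os.path.isfile(file_path)
    size = os.path.getsize(file_path)
    name = os.path.basename(file_path)
    extension = os.path.splitext(file_path)[1]
    listing = os.listdir(nested)

    # Rename the file, then delete it.
    renamed = os.path.join(nested, "app.txt")
    os.rename(file_path, renamed)
    os.remove(renamed)
    exists_after_remove = os.path.exists(renamed)

print(is_file, size, name, extension)  # True 5 app.log .log
print(listing)                         # ['app.log']
print(exists_after_remove)             # False
```

Building paths with os.path.join rather than string concatenation keeps the example portable between Windows and Unix-like systems.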
  

6.2. sys module

Python

    import sys
    sys.argv  # List of command line arguments; the first element is the path of the program itself
    sys.exit(n)  # Exit the program; use sys.exit(0) for a normal exit
    sys.version  # Version information of the Python interpreter
    sys.maxsize  # Largest container index (sys.maxint was removed in Python 3, where int has no upper limit)
    sys.path  # Module search path, initialized from the PYTHONPATH environment variable plus installation defaults
    sys.platform  # Name of the operating system platform
    sys.stdout.write('please:')  # Write to standard output without a trailing newline, e.g. to draw a progress bar
    val = sys.stdin.readline()[:-1]  # Read one line from standard input and drop the trailing newline
    sys.getrecursionlimit()  # Get the maximum recursion depth
    sys.setrecursionlimit(1200)  # Set the maximum recursion depth
    sys.getdefaultencoding()  # Get the interpreter's default string encoding
    sys.getfilesystemencoding()  # Get the encoding used to convert file names between str and bytes
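A few of these can be tried directly. The sketch below prints some interpreter information and temporarily raises the recursion limit; the increment of 200 is an arbitrary value chosen for illustration.

```python
import sys

print("arguments:", sys.argv)  # the script path comes first, then any arguments
print("python:", sys.version_info.major, sys.version_info.minor)
print("platform:", sys.platform)

# Raise the recursion limit for a moment, then restore the old value.
old_limit = sys.getrecursionlimit()
sys.setrecursionlimit(old_limit + 200)
new_limit = sys.getrecursionlimit()
sys.setrecursionlimit(old_limit)

# sys.stdout.write does not add a newline, unlike print.
sys.stdout.write("progress: 50%\n")
```

Restoring the original limit afterwards keeps the change from leaking into the rest of the program.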
  

6.3. time module

In Python, time is usually represented in one of these ways:

Timestamp: the offset in seconds from January 1, 1970 00:00:00 UTC. Example: 1596258440.116188
Formatted time string, such as 2020-08-01 13:07:20
Tuple (struct_time) with nine elements. Because Python's time module mostly calls the C library, details may differ between platforms. On Windows: time.struct_time(tm_year=2020, tm_mon=8, tm_mday=1, tm_hour=13, tm_min=10, tm_sec=43, tm_wday=5, tm_yday=214, tm_isdst=0)

Index	Attribute	Values
0	tm_year (year)	e.g. 2020
1	tm_mon (month)	1 - 12
2	tm_mday (day)	1 - 31
3	tm_hour (hour)	0 - 23
4	tm_min (minutes)	0 - 59
5	tm_sec (seconds)	0 - 61 (60 for leap seconds; 61 for historical reasons)
6	tm_wday (weekday)	0 - 6 (0 means Monday)
7	tm_yday (day of the year)	1 - 366
8	tm_isdst (whether daylight saving time)	0, 1 or -1 (-1 means unknown)

UTC time

UTC (Coordinated Universal Time) is the universal standard time and, for everyday purposes, is the same as Greenwich Mean Time. China is at UTC+8, also known as the East 8 zone. DST stands for Daylight Saving Time.

Functions of the time module

time.localtime([secs]): Converts a timestamp to a struct_time in the local time zone. If secs is not provided, the current time is used.
time.gmtime([secs]): Like localtime(), but converts the timestamp to a struct_time in UTC (time zone 0).
time.time(): Returns the timestamp of the current time.
time.mktime(t): Converts a struct_time in local time to a timestamp.
time.sleep(secs): Suspends the calling thread for the given number of seconds.
time.asctime([t]): Formats a tuple or struct_time as a string of this form: 'Tue Oct  1 12:04:38 2019'. If no argument is given, time.localtime() is used.
time.ctime([secs]): Converts a timestamp (a floating point number of seconds) to the same form as time.asctime(). If the argument is omitted or None, time.time() is used. It is equivalent to time.asctime(time.localtime(secs)).
time.strftime(format[, t]): Converts a tuple or struct_time (as returned by time.localtime() and time.gmtime()) into a formatted time string. If t is not given, time.localtime() is used.
time.strptime(string[,format]): Converts a formatted time string to a struct_time. It is the inverse of strftime().
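The conversions between the three representations can be sketched as follows. The timestamp is the whole-second part of the example value given earlier in this section; the format string "%Y-%m-%d %H:%M:%S" is one common choice, not the only one.

```python
import time

timestamp = 1596258440  # whole-second part of the example timestamp above

# Timestamp -> struct_time in UTC -> formatted string
utc_struct = time.gmtime(timestamp)
utc_text = time.strftime("%Y-%m-%d %H:%M:%S", utc_struct)
print(utc_text)                                # 2020-08-01 05:07:20
print(utc_struct.tm_wday, utc_struct.tm_yday)  # 5 214

# Timestamp -> local struct_time -> string -> struct_time -> timestamp
local_struct = time.localtime(timestamp)
local_text = time.strftime("%Y-%m-%d %H:%M:%S", local_struct)
parsed = time.strptime(local_text, "%Y-%m-%d %H:%M:%S")
round_trip = time.mktime(parsed)
print(local_text, round_trip)
```

The UTC string is the same on every machine, while local_text depends on the time zone of the computer running the code. In UTC+8 it is 2020-08-01 13:07:20, which matches the formatted example above.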

6.4. datetime module

Compared with the time module, the interface of the datetime module is more intuitive and easier to use.

The datetime module defines the following classes:

datetime.date: represents a date. Commonly used attributes are year, month, day.
datetime.time: represents a time of day. Commonly used attributes are hour, minute, second, microsecond.
datetime.datetime: represents a date and time together.
datetime.timedelta: represents a time interval, that is, the length between two points in time.
datetime.tzinfo: time zone information. (This class is not discussed in detail here; interested readers can refer to the Python manual.)

There are only a few methods we need to remember:

d=datetime.datetime.now()

Returns the current date and time as a datetime object. On d you can then use methods and attributes such as d.timestamp(), d.today(), d.year, and d.timetuple().

datetime.date.fromtimestamp()

Converts a timestamp to a date object.

Time arithmetic

Python

    >>> import datetime
    >>> print(datetime.datetime.now())
    2020-08-01 16:30:24.940736

    >>> datetime.datetime.now()
    datetime.datetime(2020, 8, 1, 16, 30, 42, 475749)

    >>> print(datetime.datetime.now() + datetime.timedelta(4))  # Current time + 4 days
    2020-08-05 16:31:04.921738

    >>> print(datetime.datetime.now() + datetime.timedelta(hours=4))  # Current time + 4 hours
    2020-08-01 20:31:19.610740
  

Time replacement

Python

    >>> d = datetime.datetime.now()
    >>> print(d.replace(year=2999,month=11,day=30))
    2999-11-30 16:28:37.857495
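Because now() changes on every call, the outputs above cannot be reproduced exactly. The sketch below repeats the same operations on a fixed datetime so the results are predictable; the chosen date is simply the one used in the examples of this article.

```python
import datetime

d = datetime.datetime(2020, 8, 1, 16, 30, 24)

plus_days = d + datetime.timedelta(days=4)
plus_hours = d + datetime.timedelta(hours=4)
replaced = d.replace(year=2999, month=11, day=30)
gap = datetime.datetime(2020, 8, 5) - datetime.datetime(2020, 8, 1)

print(plus_days)                 # 2020-08-05 16:30:24
print(plus_hours)                # 2020-08-01 20:30:24
print(replaced)                  # 2999-11-30 16:30:24
print(gap.days)                  # 4
print(d.strftime("%Y/%m/%d"))    # 2020/08/01
print(d.weekday())               # 5 (Saturday; Monday is 0)
```

Note that replace() returns a new object and leaves d unchanged, and that subtracting two datetime objects yields a timedelta.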
  

6.5. random module

Many places in a program need random values, such as the random verification code shown when logging in to a website. The random module makes it easy to generate random numbers and random strings.

Python

    >>> import random
    >>> random.randrange(1,10)  # Return a random integer from 1 to 10, excluding 10
    6
    >>> random.randint(1,10)  # Return a random integer from 1 to 10, including 10
    6
    >>> random.randrange(0, 100, 2)  # Randomly select an even number from 0 up to, but not including, 100
    76
    >>> random.random()  # Return a random floating point number in [0.0, 1.0)
    0.9257294868672783

    >>> random.choice('https://blog.tsinbei.com')  # Return one random character from the given sequence
    'b'
    >>> random.sample('https://blog.tsinbei.com',3)  # Pick the given number of characters, without repeating a position
    ['g', 'h', 'o']

    # Generate a random string
    >>> import string
    >>> ''.join(random.sample(string.ascii_lowercase + string.digits, 8))
    'clvebqw4'

    # Shuffle a list in place
    >>> a = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
    >>> random.shuffle(a)
    >>> a
    [1, 9, 7, 0, 5, 8, 4, 2, 3, 6]
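As a sketch of the verification code use case mentioned above, the helper below builds a code from uppercase letters and digits. The function name make_code and its parameters are hypothetical; it uses random.choices, which allows repeated characters, unlike random.sample.

```python
import random
import string

def make_code(length=6, rng=random):
    """Return a random code of uppercase letters and digits (repeats allowed)."""
    alphabet = string.ascii_uppercase + string.digits
    return "".join(rng.choices(alphabet, k=length))

code = make_code()
print(code)

# A seeded generator reproduces the same sequence, which is handy in tests.
first = make_code(rng=random.Random(42))
again = make_code(rng=random.Random(42))
print(first == again)  # True
```

The random module is not designed for security purposes. For codes that protect real accounts, the standard library secrets module is the usual choice.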
  

6.6. pickle and json modules

pickle and json are two modules used for serialization.

json converts between strings and Python data types
pickle converts between a Python-specific bytes format and Python data types

The pickle module provides four functions: dumps, dump, loads, load

Python

    import pickle
    data = {'test':123,'url':'https://blog.tsinbei.com'}

    # pickle.dumps converts the data into a bytes object in a format that only Python understands
    p_str = pickle.dumps(data)  # Note that dumps returns bytes, not str
    print(p_str)

    # pickle.dump converts the data in the same way and writes the result to a file
    with open('result.pk',"wb") as fp:  # The file name result.pk can be changed to any other name and suffix
        pickle.dump(data,fp)

    # pickle.load reads the data back from a file
    with open('result.pk',"rb") as f:
        d = pickle.load(f)
        print(d)
  

The json module also provides four functions: dumps, dump, loads, load. They are used the same way as in pickle.

Python

    import json
    data = {'test':123,'url':'https://blog.tsinbei.com'}

    # json.dumps converts the data into a string in a format that all programming languages can read
    j_str = json.dumps(data)  # Note that json.dumps returns a str, not bytes
    print(j_str)

    # dump into a file
    with open('config.json','w') as fp:
        json.dump(data,fp)

    # load from a file
    with open("config.json") as f:
        d = json.load(f)
        print(d)
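The examples above do not show loads, and they do not show where the two modules differ. The sketch below round-trips the same data through both modules in memory; the extra "point" key and the set are additions made here to highlight the difference.

```python
import json
import pickle

data = {"test": 123, "url": "https://blog.tsinbei.com", "point": (1, 2)}

# loads is the inverse of dumps for both modules.
from_json = json.loads(json.dumps(data))
from_pickle = pickle.loads(pickle.dumps(data))

print(from_json["point"])    # [1, 2]  (JSON has no tuple type)
print(from_pickle["point"])  # (1, 2)
print(from_pickle == data)   # True

# json only supports basic types; a set raises TypeError.
try:
    json.dumps({"tags": {"a", "b"}})
except TypeError as exc:
    json_error = str(exc)
    print("json cannot serialize a set:", json_error)

# pickle handles Python-specific types such as set.
restored_set = pickle.loads(pickle.dumps({"a", "b"}))
print(restored_set == {"a", "b"})  # True
```

In short, json is the better fit for exchanging data with other languages, while pickle preserves Python types exactly. Only unpickle data that comes from a source you trust, because loading a pickle can execute arbitrary code.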
  

6.7. hashlib module

In Python 3.x, hashlib replaces the old md5 and sha modules. It mainly provides the SHA1, SHA224, SHA256, SHA384, SHA512, and MD5 algorithms.

Python

    import hashlib

    # md5
    md5 = hashlib.md5()
    md5.update(b"https://blog.tsinbei.com")
    print(md5.digest())  # Return the hash value as bytes
    print(md5.hexdigest())  # Return the hash value as a hexadecimal string

    # If the amount of data is large, you can call update() several times; the final result is the same
    md5 = hashlib.md5()
    md5.update(b"https://")
    md5.update(b"blog.tsinbei.com")
    print(md5.hexdigest())  # Same hexadecimal hash value as above

    # sha1
    s1 = hashlib.sha1()
    s1.update(b"blog.tsinbei.com")
    print(s1.hexdigest())

    # sha256
    s256 = hashlib.sha256()
    s256.update(b"blog.tsinbei.com")
    print(s256.hexdigest())

    # sha512
    s512 = hashlib.sha512()
    s512.update(b"blog.tsinbei.com")
    print(s512.hexdigest())
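Calling update() several times is most useful when hashing a file that is too large to read at once. The helper below is a minimal sketch of that idea; the function name sha256_of_file and the 64 KiB chunk size are illustrative choices, and the test file is created in a temporary location.

```python
import hashlib
import os
import tempfile

def sha256_of_file(path, chunk_size=65536):
    """Hash a file piece by piece so it never has to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # iter() with a sentinel keeps reading until read() returns b"".
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

payload = b"https://blog.tsinbei.com" * 10000
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(payload)
    tmp_path = tmp.name

try:
    file_hash = sha256_of_file(tmp_path)
finally:
    os.remove(tmp_path)

# Hashing in chunks gives the same result as hashing everything at once.
print(file_hash == hashlib.sha256(payload).hexdigest())  # True
```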
  

6.8. shutil module

Python

    import shutil

    shutil.copyfileobj(fsrc, fdst[,length])  # Copy the contents of one file object to another

    shutil.copyfile(src, dst)  # Copy the contents of a file

    shutil.copymode(src, dst)  # Copy permissions only; contents, group, and owner remain unchanged

    shutil.copystat(src, dst)  # Copy status information only, including mode bits, atime, mtime, flags

    shutil.copy(src, dst)  # Copy a file and its permissions

    shutil.copy2(src, dst)  # Copy a file and its status information

    shutil.ignore_patterns(*patterns)  # Build an ignore function for copytree from glob-style patterns

    shutil.copytree(src, dst, symlinks=False, ignore=None)  # Recursively copy a folder

    shutil.rmtree(path[, ignore_errors[, onerror]])  # Recursively delete an entire directory (it does not go to the recycle bin)

    shutil.move(src, dst)  # Recursively move a file or directory

Other functions:

shutil.make_archive(base_name, format,…)

Creates an archive such as zip or tar and returns the path of the archive file.

The parameters are as follows (base_name and format are required, the rest are optional):

base_name: The file name of the archive, or its path. If it is only a file name, the archive is saved to the current directory; otherwise it is saved to the given path.
For example, data_bak => saved to the current directory
For example, /tmp/data_bak => saved to /tmp/
format: the type of archive: "zip", "tar", "bztar", "gztar", "xztar"
root_dir: path of the folder to be archived (default: current directory)
owner: user, default current user
group: group, default current group
logger: for logging, usually a logging.Logger object

Python

    import shutil

    # Pack the files in the data folder of the current directory and place the archive in the current program directory
    ret = shutil.make_archive("data_bak", 'gztar', root_dir='data')

    # Pack the files in the data folder on the C drive and place the archive in the C:/tmp/ directory
    ret = shutil.make_archive("C:/tmp/data_bak", 'gztar', root_dir='C:/data')
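The sketch below runs a few of these shutil functions end to end inside a temporary directory, so it can be tried safely. The file names (a.txt, skip.tmp) and folder names are arbitrary, and shutil.unpack_archive, which is not listed above, is used as the counterpart of make_archive.

```python
import os
import shutil
import tempfile

with tempfile.TemporaryDirectory() as base:
    # Prepare a small source folder with two files.
    src_dir = os.path.join(base, "data")
    os.makedirs(src_dir)
    for file_name, text in (("a.txt", "alpha"), ("skip.tmp", "temp")):
        with open(os.path.join(src_dir, file_name), "w") as fh:
            fh.write(text)

    # Recursively copy the folder, leaving out *.tmp files.
    copy_dir = os.path.join(base, "data_copy")
    shutil.copytree(src_dir, copy_dir, ignore=shutil.ignore_patterns("*.tmp"))
    copied = sorted(os.listdir(copy_dir))

    # Pack the source folder, then unpack the archive elsewhere.
    archive_path = shutil.make_archive(os.path.join(base, "data_bak"), "zip", root_dir=src_dir)
    extract_dir = os.path.join(base, "restored")
    os.makedirs(extract_dir)
    shutil.unpack_archive(archive_path, extract_dir)
    restored = sorted(os.listdir(extract_dir))

    # Delete the copied folder and everything in it.
    shutil.rmtree(copy_dir)
    copy_still_exists = os.path.exists(copy_dir)

print(copied)                          # ['a.txt']
print(os.path.basename(archive_path))  # data_bak.zip
print(restored)                        # ['a.txt', 'skip.tmp']
print(copy_still_exists)               # False
```

Keeping the archive outside the folder being packed avoids the archive trying to include itself.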
  

shutil handles archives by calling the zipfile and tarfile modules, for example:

Python

    import zipfile

    # Compress
    z = zipfile.ZipFile('abc.zip', 'w')
    z.write('a.log')
    z.close()

    # Extract
    z = zipfile.ZipFile('abc.zip', 'r')
    z.extractall(path='.')
    z.close()

Python

    import tarfile

    # Pack
    t = tarfile.open('/tmp/egon.tar','w')
    t.add('/test1/a.py',arcname='a.bak')
    t.add('/test1/b.py', arcname='b.bak')
    t.close()

    # Extract
    t = tarfile.open('/tmp/egon.tar','r')
    t.extractall('/egon')
    t.close()
  

6.9. re module

A regular expression is a special sequence of characters that lets you easily check whether a string matches a certain pattern.
The re module has been part of Python since version 1.5 and provides Perl-style regular expression patterns.
The re module brings full regular expression capabilities to the Python language.

For regular expression syntax, refer to the rookie tutorial.


The matching functions of re are as follows:

re.match: match from the beginning of the string
re.search: find the first match anywhere in the string
re.findall: return all matches as elements of a list
re.split: split the string, using the matches as separators
re.sub: replace the matches
re.fullmatch: match only if the whole string matches the pattern

re.match(pattern, string, flags=0)

Starting at the beginning of the string, match the pattern against the string and return only a single match.

pattern: the regular expression
string: the string to match
flags: flag bits that control how the regular expression is matched

Python

    import re
    obj = re.match(r'\d+', '123uuasf456')  # Returns a match object on success, otherwise None
    if obj:
        print(obj.group())  # Output: 123
  

Flags

re.I (re.IGNORECASE): ignore case (the full name is in parentheses, same below)
re.M (re.MULTILINE): multiline mode; '^' and '$' also match at the start and end of each line
re.S (re.DOTALL): make . match any character, including newlines
re.X (re.VERBOSE): allow whitespace and comments in the pattern to make it more readable
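The effect of each flag is easiest to see side by side. The sketch below applies the four flags to short strings made up for this example.

```python
import re

# re.I: case-insensitive matching
ignore_case = re.findall(r"python", "Python PYTHON python", re.I)
print(ignore_case)  # ['Python', 'PYTHON', 'python']

# re.M: '^' matches at the start of every line, not only the string
text = "first line\nsecond line"
without_m = re.findall(r"^\w+", text)
with_m = re.findall(r"^\w+", text, re.M)
print(without_m)  # ['first']
print(with_m)     # ['first', 'second']

# re.S: '.' also matches the newline character
no_dotall = re.match(r"a.b", "a\nb")
dotall = re.match(r"a.b", "a\nb", re.S)
print(no_dotall)             # None
print(repr(dotall.group()))  # 'a\nb'

# re.X: whitespace and comments inside the pattern are ignored
year_month = re.compile(r"""
    (\d{4})   # year
    -
    (\d{2})   # month
""", re.X)
print(year_month.match("2020-08").groups())  # ('2020', '08')
```

Several flags can be combined with the | operator, for example re.I | re.M.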

re.search(pattern, string, flags=0)

Scan the whole string for the pattern and return only the first match.

Python

    import re
    obj = re.search(r'\d+', 'u123uu888asf')
    if obj:
        print(obj.group())  # Output: 123
  

re.findall(pattern, string, flags=0)

Both match and search return a single match, that is, only the first occurrence in the string. To get every occurrence that matches the pattern, use findall.

Python

    import re
    obj = re.findall(r'\d+', 'fa123uu888asf')
    print(obj)  # Output: ['123', '888']
  

re.sub(pattern, repl, string, count=0, flags=0)

Replaces the matched substrings; more powerful than str.replace

Python

    >>> re.sub(r'[a-z]+','666','blog.tsinbei.com 666')
    '666.666.666 666'

    >>> re.sub(r'\d+','|', 'alex22wupeiqi33oldboy55',count=2)
    'alex|wupeiqi|oldboy55'
  

re.split(pattern, string, maxsplit=0, flags=0)

Split the string into a list, using each match as a split point

Python

    >>> s='9-2*5/3+7/3*99/4*2998+10*568/14'
    >>> re.split(r'[\*\-\/\+]',s)
    ['9', '2', '5', '3', '7', '3', '99', '4', '2998', '10', '568', '14']

    >>> re.split(r'[\*\-\/\+]',s,maxsplit=3)
    ['9', '2', '5', '3+7/3*99/4*2998+10*568/14']
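re.fullmatch appears in the function list above but has no example. The sketch below contrasts it with re.match and also shows how capture groups pull out parts of a match; the date pattern and sample strings are made up for illustration.

```python
import re

pattern = r"\d{4}-\d{2}-\d{2}"
text = "2020-08-01 backup"

# match only needs the pattern to fit at the start of the string.
prefix = re.match(pattern, text)
print(prefix.group())  # 2020-08-01

# fullmatch requires the entire string to fit the pattern.
whole = re.fullmatch(pattern, text)
exact = re.fullmatch(pattern, "2020-08-01")
print(whole)          # None
print(exact.group())  # 2020-08-01

# Parentheses create groups that can be read back individually.
parts = re.search(r"(\d{4})-(\d{2})-(\d{2})", text)
print(parts.groups())  # ('2020', '08', '01')
print(parts.group(1))  # 2020
```

fullmatch is a convenient way to validate input such as dates or phone numbers, because trailing characters make the match fail.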