NumPy Operations Reference

Latest recommended article published on 2025-12-04 10:18:38

This content is licensed under the CC 4.0 BY-SA license.

np.unique(array, return_inverse=True)

np_utils.to_categorical(coded_id)
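
The two calls above are commonly paired to turn labels into integer codes and then into one-hot vectors. Below is a minimal sketch of that idea, assuming `np_utils` refers to the Keras helper module; the sample `labels` array is hypothetical, and the identity-matrix indexing is a NumPy-only stand-in for `to_categorical`, not its actual implementation.

```python
import numpy as np

labels = np.array(["cat", "dog", "cat", "bird"])  # hypothetical sample data

# classes: sorted unique values; codes: position of each label within classes
classes, codes = np.unique(labels, return_inverse=True)

# One-hot encode by picking rows of an identity matrix
# (NumPy-only stand-in for np_utils.to_categorical)
one_hot = np.eye(len(classes), dtype=int)[codes]

print(classes)   # ['bird' 'cat' 'dog']
print(codes)     # [1 2 1 0]
print(one_hot)
```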

np.random

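Random numbers:

rand(d0, d1, ..., dn)	Random values of the given shape, sampled uniformly from the half-open interval [0, 1). np.random.rand(2, 3) [[0.42085 0.74822 0.79452] [0.08833 0.85959 0.25843]] np.random.rand(3) [0.50288776 0.33155867 0.84370448]
randn(d0, d1, ..., dn)	Same call style as rand, but the samples follow the standard normal distribution N(0, 1); equivalent to `standard_normal`([size])
randint(low, high=None, size=None, dtype=int)	Integers drawn from a discrete uniform distribution over the half-open interval [low, high); if high is None, the interval becomes [0, low). np.random.randint(5, size=(2, 4)) [[1 0 2 3] [4 3 2 4]] np.random.randint(2, size=5) [1 0 0 0 1] np.random.randint(2, 5, size=5) [2 2 3 4 3]
random_integers(low, high=None, size=None)	Integers drawn from a discrete uniform distribution over the closed interval [low, high]; if high is None, the interval becomes [1, low]. Deprecated; randint(low, high + 1) covers the same range. np.random.random_integers(1, 5, size=(2, 3)) [[4 2 4] [1 1 5]]
random_sample([size])	Random floats of the given shape in the half-open interval [0.0, 1.0).
random([size])	Random floats in the half-open interval [0.0, 1.0). (The official examples are identical to those for random_sample.)
ranf([size])	Random floats in the half-open interval [0.0, 1.0). (The official examples are identical to those for random_sample.)
sample([size])	Random floats in the half-open interval [0.0, 1.0). (The official examples are identical to those for random_sample.)
choice(a, size=None, replace=True, p=None)	If a is an array, elements are drawn from a; if a is a single int, elements are drawn from range(a). replace is a bool: with True the same element may be drawn more than once, with False it may not. p is an array holding the probability of drawing each element.
bytes(length)	Returns random bytes. >>> np.random.bytes(10)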
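
A short sketch of the interval conventions listed above, using the legacy `np.random` functions from the table. The seed value and the variable names are arbitrary choices for illustration.

```python
import numpy as np

np.random.seed(0)  # arbitrary seed so repeated runs give the same draws

u = np.random.rand(2, 3)                  # floats in [0, 1), shape (2, 3)
k = np.random.randint(2, 5, size=5)       # integers in [2, 5): 2, 3 or 4
picks = np.random.choice(5, size=3, replace=False)   # 3 distinct values from range(5)
weighted = np.random.choice(["a", "b"], size=4, p=[0.9, 0.1])  # "a" is far more likely

print(u.shape, k, picks, weighted)
```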

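Permutations

shuffle(x)	Modifies the sequence in place, changing its own contents (like shuffling a deck of cards). np.random.shuffle(arr)
permutation(x)	Returns a randomly permuted copy. np.random.permutation(10) np.random.permutation(array)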
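
The difference between the two functions can be sketched as follows; the variable names are illustrative only.

```python
import numpy as np

arr = np.arange(5)

perm = np.random.permutation(arr)   # returns a new shuffled array
unchanged = arr.tolist()            # arr itself is still [0, 1, 2, 3, 4]

result = np.random.shuffle(arr)     # shuffles arr in place and returns None

print(perm, unchanged, arr, result)
```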

Random number generator:

RandomState	Container for the Mersenne Twister pseudo-random number generator.
seed([seed])	Seed the generator.
get_state()	Return a tuple representing the internal state of the generator.
set_state(state)	Set the internal state of the generator from a tuple.
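
A minimal sketch of how seeding and state capture make draws repeatable. The seed values 42 and 7 are arbitrary.

```python
import numpy as np

# Re-seeding with the same value replays the same sequence
np.random.seed(42)
first = np.random.rand(3)
np.random.seed(42)
again = np.random.rand(3)

# Saving and restoring the state has the same effect mid-stream
state = np.random.get_state()
a = np.random.rand(3)
np.random.set_state(state)
b = np.random.rand(3)

# A RandomState instance is a separate generator with its own state
rs = np.random.RandomState(7)
c = rs.rand(3)

print(np.array_equal(first, again), np.array_equal(a, b))  # True True
```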

Distributions:

beta(a, b[, size])	Samples from a Beta distribution, over [0, 1].
binomial(n, p[, size])	Samples from a binomial distribution.
chisquare(df[, size])	Samples from a chi-square distribution.
dirichlet(alpha[, size])	Samples from a Dirichlet distribution.
exponential([scale, size])	Exponential distribution.
f(dfnum, dfden[, size])	Samples from an F distribution.
gamma(shape[, scale, size])	Gamma distribution.
geometric(p[, size])	Geometric distribution.
gumbel([loc, scale, size])	Gumbel distribution.
hypergeometric(ngood, nbad, nsample[, size])	Samples from a hypergeometric distribution.
laplace([loc, scale, size])	Samples from a Laplace (double exponential) distribution.
logistic([loc, scale, size])	Samples from a logistic distribution.
lognormal([mean, sigma, size])	Log-normal distribution.
logseries(p[, size])	Logarithmic series distribution.
multinomial(n, pvals[, size])	Multinomial distribution.
multivariate_normal(mean, cov[, size])	Multivariate normal distribution. `>>> mean = [0,0] >>> cov = [[1,0],[0,100]] >>> import matplotlib.pyplot as plt >>> x, y = np.random.multivariate_normal(mean, cov, 5000).T >>> plt.plot(x, y, 'x'); plt.axis('equal'); plt.show()`
negative_binomial(n, p[, size])	Negative binomial distribution.
noncentral_chisquare(df, nonc[, size])	Noncentral chi-square distribution.
noncentral_f(dfnum, dfden, nonc[, size])	Noncentral F distribution.
normal([loc, scale, size])	Normal (Gaussian) distribution. `>>> mu, sigma = 0, 0.1 # mean and standard deviation >>> s = np.random.normal(mu, sigma, 1000) >>> abs(mu - np.mean(s)) < 0.01 >>> abs(sigma - np.std(s, ddof=1)) < 0.01`
pareto(a[, size])	Pareto II (Lomax) distribution.
poisson([lam, size])	Poisson distribution.
power(a[, size])	Draws samples in [0, 1] from a power distribution with positive exponent a - 1.
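rayleigh([scale, size])	Rayleigh distribution.
standard_cauchy([size])	Standard Cauchy distribution.
standard_exponential([size])	Standard exponential distribution.
standard_gamma(shape[, size])	Standard gamma distribution.
standard_normal([size])	Standard normal distribution (mean=0, stdev=1).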
standard_t(df[, size])	Standard Student’s t distribution with df degrees of freedom.
triangular(left, mode, right[, size])	Triangular distribution.
uniform([low, high, size])	Uniform distribution.
vonmises(mu, kappa[, size])	von Mises distribution.
wald(mean, scale[, size])	Wald (inverse Gaussian) distribution.
weibull(a[, size])	Weibull distribution.
zipf(a[, size])	Zipf distribution.
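
As a rough illustration of how the distribution functions are called, here is a sketch that draws from three of them. The parameter values, sample sizes, and the seed are arbitrary assumptions.

```python
import numpy as np

rng = np.random.RandomState(0)  # arbitrary seed for repeatability

s = rng.normal(loc=0.0, scale=0.1, size=10000)   # mean 0, standard deviation 0.1
flips = rng.binomial(n=10, p=0.5, size=1000)     # successes out of 10 trials each
u = rng.uniform(low=-1.0, high=1.0, size=1000)   # floats in [-1, 1)

# With this many samples the empirical moments land close to the parameters
print(abs(s.mean()) < 0.01, abs(s.std(ddof=1) - 0.1) < 0.01)
print(flips.min(), flips.max(), u.min(), u.max())
```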

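Array attributes:

Attribute	Description
ndarray.ndim	Rank, i.e. the number of axes (dimensions)
ndarray.shape	Dimensions of the array as a tuple; for a matrix with n rows and m columns it is (n, m)
ndarray.size	Total number of elements, equal to n*m for a shape of (n, m)
ndarray.dtype	Element type of the ndarray object
ndarray.itemsize	Size of each element of the ndarray object, in bytes
ndarray.flags	Memory layout information of the ndarray object
ndarray.real	Real part of the ndarray elements
ndarray.imag	Imaginary part of the ndarray elements
ndarray.data	Buffer containing the actual array elements; elements are normally accessed by indexing, so this attribute is rarely needed.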
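
The attributes in the table can be inspected directly. A small sketch with illustrative arrays:

```python
import numpy as np

a = np.arange(6, dtype=np.int32).reshape(2, 3)

print(a.ndim)      # 2
print(a.shape)     # (2, 3)
print(a.size)      # 6
print(a.dtype)     # int32
print(a.itemsize)  # 4 (bytes per int32 element)
print(a.flags.c_contiguous)  # True

z = np.array([1 + 2j, 3 - 4j])
print(z.real, z.imag)  # [1. 3.] [ 2. -4.]
```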

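Creating arrays:

1. numpy.empty(shape, dtype = float, order = 'C')    Uninitialized array; the elements hold arbitrary values

2. numpy.zeros(shape, dtype = float, order = 'C')   Array filled with zeros

3. numpy.asarray(a, dtype = None, order = None)    a: a list, a tuple, or a list of tuples

4. numpy.frombuffer(buffer, dtype = float, count = -1, offset = 0)  Interprets a buffer as a 1-D array (stream-style input); in Python 3 the buffer must be bytes-like, e.g. buffer=b'Hello World' with dtype='S1'

5. numpy.fromiter(iterable, dtype, count=-1)   Builds a 1-D array from an iterable, e.g. iterable=iter([1,2,3,4,5])

6. numpy.arange(start, stop, step, dtype)  Evenly spaced values over the range [start, stop)

7. np.linspace(start, stop, num=50, endpoint=True, retstep=False, dtype=None)  Arithmetic sequence

8. np.logspace(start, stop, num=50, endpoint=True, base=10.0, dtype=None)  Geometric sequence; base is the base of the logarithm, so the sequence runs from base ** start to base ** stop

9. for x in np.nditer(a.T, order='C')  Element iteration; order='C' is row-major, order='F' is column-major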
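
A brief sketch of several of the constructors above. The argument values are chosen only for illustration.

```python
import numpy as np

ar = np.arange(10, 20, 2)          # [10 12 14 16 18]; stop is excluded
lin = np.linspace(0, 1, num=5)     # [0.   0.25 0.5  0.75 1.  ]; endpoint included
logs = np.logspace(0, 3, num=4)    # [   1.   10.  100. 1000.] = 10 ** [0, 1, 2, 3]

# frombuffer needs a bytes-like object in Python 3; 'S1' means 1-byte strings
buf = np.frombuffer(b"Hello World", dtype="S1")

# fromiter consumes any iterable; dtype is required
it = np.fromiter(iter([1, 2, 3, 4, 5]), dtype=float)

print(ar, lin, logs, buf, it)
```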

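Changing array shape:

Function	Description
`reshape`	Changes the shape without changing the data. np.arange(8).reshape(4,2) The result shares memory with the original where possible, so modifying one modifies the other
`flat`	Iterator over the array elements. for element in a.flat
`flatten`	Returns a flattened copy of the array; changes to the copy do not affect the original. ndarray.flatten(order='C')
`ravel`	Returns the flattened array as a view where possible, so changes can affect the source array. numpy.ravel(a, order='C')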
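
The view-versus-copy behavior described in the table can be checked with a small sketch (illustrative values; `a` is contiguous, so `reshape` and `ravel` return views here):

```python
import numpy as np

a = np.arange(8)

r = a.reshape(4, 2)   # view on the same data
r[0, 0] = 100         # also changes a[0]

f = a.flatten()       # independent copy
f[1] = -1             # a[1] is untouched

v = a.ravel()         # view, because a is contiguous
v[2] = -2             # also changes a[2]

print(a)              # [100   1  -2   3   4   5   6   7]
print(np.shares_memory(a, r), np.shares_memory(a, f))  # True False
```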

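Transposing arrays:

`transpose`	Permutes the dimensions of an array. numpy.transpose(arr, axes)
`ndarray.T`	Same as `self.transpose()`. a.T
`rollaxis`	Rolls the specified axis backwards to a given position. numpy.rollaxis(arr, axis, start)
`swapaxes`	Interchanges two axes of an array. numpy.swapaxes(arr, axis1, axis2)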
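
Tracking shapes is the easiest way to see what each of these functions does. A sketch on an illustrative 3-D array:

```python
import numpy as np

a = np.zeros((2, 3, 4))

t_all = np.transpose(a)               # reverses all axes
t_sel = np.transpose(a, (1, 0, 2))    # explicit axis order
sw = np.swapaxes(a, 0, 2)             # exchanges axes 0 and 2
roll0 = np.rollaxis(a, 2)             # moves axis 2 to the front (start defaults to 0)
roll1 = np.rollaxis(a, 2, 1)          # moves axis 2 to position 1

print(t_all.shape, t_sel.shape, sw.shape, roll0.shape, roll1.shape)
# (4, 3, 2) (3, 2, 4) (4, 3, 2) (4, 2, 3) (2, 4, 3)
```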

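Changing array dimensions:

Function	Description
`broadcast`	Produces an object that mimics broadcasting
`broadcast_to`	Broadcasts an array to a new shape. np.broadcast_to(a,(4,4))
`expand_dims`	Expands the shape of an array. numpy.expand_dims(arr, axis) arr: input array; axis: position at which the new axis is inserted
`squeeze`	Removes axes of length one from the shape of an array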
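
A minimal sketch of the four functions on an illustrative 1-D array:

```python
import numpy as np

a = np.array([1, 2, 3])

b = np.broadcast_to(a, (4, 3))      # read-only view with a repeated along a new axis
row = np.expand_dims(a, axis=0)     # shape (1, 3)
col = np.expand_dims(a, axis=1)     # shape (3, 1)
back = np.squeeze(row)              # length-one axis removed: shape (3,)
bc = np.broadcast(col, row)         # object describing the broadcast result

print(b.shape, row.shape, col.shape, back.shape, bc.shape)
# (4, 3) (1, 3) (3, 1) (3,) (3, 3)
```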

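Joining arrays:

Function	Description
`concatenate`	Joins a sequence of arrays along an existing axis. numpy.concatenate((a1, a2, ...), axis)
`stack`	Joins a sequence of arrays along a new axis. numpy.stack(arrays, axis)
`hstack`	Stacks the arrays in a sequence horizontally (column-wise)
`vstack`	Stacks the arrays in a sequence vertically (row-wise)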
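
The joining functions differ mainly in which axis they use and whether they create a new one. A sketch with two illustrative 2x2 arrays:

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])
b = np.array([[5, 6], [7, 8]])

c0 = np.concatenate((a, b), axis=0)   # existing axis 0: shape (4, 2)
c1 = np.concatenate((a, b), axis=1)   # existing axis 1: shape (2, 4)
st = np.stack((a, b), axis=0)         # new leading axis: shape (2, 2, 2)
h = np.hstack((a, b))                 # [[1 2 5 6] [3 4 7 8]]
v = np.vstack((a, b))                 # [[1 2] [3 4] [5 6] [7 8]]

print(c0.shape, c1.shape, st.shape)
print(h)
print(v)
```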

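Splitting arrays:

Function	Description
`split`	Splits an array into multiple sub-arrays. numpy.split(ary, indices_or_sections, axis) np.split(a,3) np.split(a,[4,7])
`hsplit`	Splits an array horizontally into multiple sub-arrays (column-wise)
`vsplit`	Splits an array vertically into multiple sub-arrays (row-wise)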
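
The two forms of `indices_or_sections` (an integer count versus a list of cut positions) can be sketched as follows, with illustrative arrays:

```python
import numpy as np

a = np.arange(9)

equal = np.split(a, 3)          # three equal parts: [0 1 2], [3 4 5], [6 7 8]
at_idx = np.split(a, [4, 7])    # cut before indices 4 and 7: [0..3], [4 5 6], [7 8]

m = np.arange(16).reshape(4, 4)
left, right = np.hsplit(m, 2)   # two (4, 2) blocks of columns
top, bottom = np.vsplit(m, 2)   # two (2, 4) blocks of rows

print([p.tolist() for p in equal])
print([p.tolist() for p in at_idx])
print(left.shape, top.shape)
```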

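Adding and removing array elements

Function	Description
`resize`	Returns a new array with the specified shape; if the new array is larger than the original, it is filled with repeated copies of the original elements.
`append`	Appends values to the end of an array. numpy.append(arr, values, axis=None) When axis is not given, both inputs are flattened and a 1-D array is returned
`insert`	Inserts values along the given axis before the given index. numpy.insert(arr, obj, values, axis) obj: index
`delete`	Deletes sub-arrays along an axis and returns the resulting new array. numpy.delete(arr, obj, axis)
`unique`	Finds the unique elements of an array. u,indices = numpy.unique(arr, return_index=True); return_inverse and return_counts add further outputs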
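
A sketch of these functions on small illustrative arrays. All of them return new arrays and leave the input unchanged.

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6]])

flat = np.append(a, [7, 8, 9])             # no axis: flattened 1-D result
rows = np.append(a, [[7, 8, 9]], axis=0)   # adds a row: shape (3, 3)
ins = np.insert(a, 1, 0, axis=1)           # column of zeros before column 1
dele = np.delete(a, 1, axis=0)             # removes row 1
res = np.resize(a, (3, 3))                 # repeats the elements to fill the new shape

u, idx, counts = np.unique([5, 2, 6, 2, 7, 5], return_index=True, return_counts=True)

print(flat, rows.shape)
print(ins, dele, res)
print(u, idx, counts)   # [2 5 6 7] [1 0 2 4] [2 2 1 1]
```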

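String functions

Function	Description
`add()`	Concatenates the string elements of two arrays element-wise. numpy.char.add() np.char.add(['hello'],[' xyz'])
`multiply()`	Returns the string repeated element-wise. numpy.char.multiply() np.char.multiply('Runoob ',3)
`center()`	Centers the string, padding on the left and right with the given character. np.char.center('Runoob', 20,fillchar = '*')
`capitalize()`	Converts the first letter of the string to uppercase. np.char.capitalize('runoob')
`title()`	Converts the first letter of each word in the string to uppercase. np.char.title('i like runoob')
`lower()`	Converts array elements to lowercase. np.char.lower('RUNOOB')
`upper()`	Converts array elements to uppercase. np.char.upper(['runoob','google'])
`split()`	Splits the string on the given separator and returns a list of the pieces. np.char.split('www.runoob.com', sep = '.')
`splitlines()`	Returns a list of the lines in the element, split at line breaks. np.char.splitlines('i\nlike runoob?')
`strip()`	Removes the given characters from the start and end of each element. np.char.strip(['arunooba','admin','java'],'a')
`join()`	Joins the characters of each element with the given separator. np.char.join([':','-'],['runoob','google']) np.char.join(':','runoob')
`replace()`	Replaces all occurrences of a substring with a new string. np.char.replace('i like runoob', 'oo', 'cc')
`decode()`	Calls `bytes.decode` on each array element. np.char.decode(a,'cp500')
`encode()`	Calls `str.encode` on each array element. np.char.encode('runoob', 'cp500')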
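
A short sketch of a few `np.char` functions, using list inputs so that each result is a regular 1-D array. The sample strings follow the table above.

```python
import numpy as np

added = np.char.add(["hello"], [" xyz"])
repeated = np.char.multiply(["Runoob "], 3)
centered = np.char.center(["Runoob"], 20, fillchar="*")
upper = np.char.upper(["runoob", "google"])
stripped = np.char.strip(["arunooba", "admin", "java"], "a")
replaced = np.char.replace(["i like runoob"], "oo", "cc")
parts = np.char.split(["www.runoob.com"], sep=".")   # object array holding a list

# encode turns str into bytes; decode reverses it with the same codec
encoded = np.char.encode(["runoob"], "cp500")
decoded = np.char.decode(encoded, "cp500")

print(added, repeated, centered, upper, stripped, replaced, parts, decoded)
```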

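Functions:

  Rounding: numpy.around(a,decimals)

  Trigonometric functions: sin(), cos(), tan()

  Round down to integer: numpy.floor()

  Round up to integer: numpy.ceil()

  Reciprocal: numpy.reciprocal() returns the element-wise reciprocal of its argument

  Power: numpy.power()   np.power(a,2) np.power(a,b) with a = [10,100,1000], b = [1,2,3]

  Addition, subtraction, multiplication, division: add(), subtract(), multiply() and divide() np.add(a,b)

  Remainder: numpy.mod()  numpy.remainder()

  Minimum and maximum (axis=None, 0, or 1): numpy.amin() and numpy.amax()

  Range (maximum minus minimum): numpy.ptp()

  Percentile, the value below which the given percentage of observations fall: numpy.percentile(a, q, axis)

  Median: numpy.median()

  Mean: numpy.mean()

  Weighted average: np.average([1,2,3, 4],weights = [4,3,2,1], returned = True) With returned set to True, the sum of the weights is returned as well

  Standard deviation: std = sqrt(mean((x - x.mean())**2)); np.std([1,2,3,4])

  Variance: np.var([1,2,3,4])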
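
A sketch of the arithmetic and statistical functions above on a small illustrative matrix, including a check of the standard deviation formula.

```python
import numpy as np

a = np.array([[3, 7, 5], [8, 4, 3], [2, 4, 9]])

row_min = np.amin(a, axis=1)     # minimum of each row: [3 3 2]
col_max = np.amax(a, axis=0)     # maximum of each column: [8 7 9]
spread = np.ptp(a)               # 9 - 2 = 7
p50 = np.percentile(a, 50)       # 50th percentile equals the median: 4.0
mean = np.mean(a)                # 45 / 9 = 5.0

# Weighted average (1*4 + 2*3 + 3*2 + 4*1) / 10 = 2.0, plus the weight sum 10.0
avg, wsum = np.average([1, 2, 3, 4], weights=[4, 3, 2, 1], returned=True)

x = np.array([1, 2, 3, 4])
std_manual = np.sqrt(np.mean((x - x.mean()) ** 2))   # formula from the text
powers = np.power([10, 100, 1000], [1, 2, 3])        # [10, 10000, 1000000000]
rem = np.mod([10, 20, 30], [3, 5, 7])                # [1 0 2]

print(row_min, col_max, spread, p50, mean, avg, wsum)
print(np.isclose(std_manual, np.std(x)), np.var(x))  # True 1.25
```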

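Sorting:

Kind	Speed	Worst case	Work space	Stable
`'quicksort'`	1	`O(n^2)`	0	no
`'mergesort'`	2	`O(n*log(n))`	~n/2	yes
`'heapsort'`	3	`O(n*log(n))`	0	no

  numpy.sort(a, axis, kind, order)

  numpy.argsort() returns the indices that would sort the array in ascending order

  numpy.argmax() and numpy.argmin() return the indices of the maximum and minimum elements along the given axis

  numpy.nonzero() returns the indices of the non-zero elements of the input array.

  numpy.where() returns the indices of the elements of the input array that satisfy the given condition.

  numpy.lexsort() sorts by multiple keys; the last key is the primary sort key

  numpy.extract() extracts elements from an array according to a condition and returns the elements that satisfy it.

Function	Description
msort(a)	Sorts the array along the first axis and returns a sorted copy. np.msort(a) is equivalent to np.sort(a, axis=0); msort is deprecated in newer NumPy releases, so np.sort(a, axis=0) is the safer spelling.
sort_complex(a)	Sorts complex numbers by real part first, then imaginary part.
partition(a, kth[, axis, kind, order])	Partitions the array around position kth: the element at kth ends up in its sorted position, with smaller elements before it and equal or larger elements after it
argpartition(a, kth[, axis, kind, order])	Returns the indices that would partition the array along the given axis; the algorithm can be chosen with the kind keyword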
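
The index-returning functions above are easiest to understand side by side. A sketch with illustrative arrays:

```python
import numpy as np

x = np.array([3, 1, 2])

order = np.argsort(x)            # [1 2 0]: indices that sort x ascending
sorted_x = x[order]              # [1 2 3], same as np.sort(x)
hi, lo = np.argmax(x), np.argmin(x)   # 0 and 1

nz = np.nonzero(np.array([0, 5, 0, 7]))   # (array([1, 3]),)
idx = np.where(x > 1)                     # (array([0, 2]),)
vals = np.extract(x > 1, x)               # [3 2]

# lexsort: the LAST key (a) is the primary key, b breaks ties
a = [1, 5, 1, 4, 3, 4, 4]
b = [9, 4, 0, 4, 0, 2, 1]
lex = np.lexsort((b, a))         # [2 0 4 6 5 3 1]

# partition: position 2 holds the value it would have after a full sort
p = np.partition(np.array([3, 4, 2, 1]), 2)

print(order, sorted_x, hi, lo, nz, idx, vals, lex, p)
```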

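Copies and views:

Plain assignment does not create a copy of the array object. b = a
The ndarray.view() method creates a new array object that shares the same data.
Slicing creates a view, so modifying the slice's data affects the original array.
ndarray.copy() creates a copy with its own data.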
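
The four cases above can be told apart with `is` and `np.shares_memory`. A minimal sketch with illustrative names:

```python
import numpy as np

a = np.arange(6)

b = a            # same object, no new array
v = a.view()     # new array object, same underlying data
s = a[2:4]       # slice: also a view
c = a.copy()     # independent data

s[0] = 99        # writes through to a (and therefore to b and v)

print(b is a, v is a)                                   # True False
print(np.shares_memory(a, v), np.shares_memory(a, c))   # True False
print(a[2], v[2], c[2])                                 # 99 99 2
```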

Matrices:

numpy.matlib.empty(shape, dtype, order)
numpy.matlib.zeros((2,2))
numpy.matlib.ones((2,2)) 
numpy.matlib.eye(n, M,k, dtype)  Ones on the k-th diagonal, zeros everywhere else
numpy.matlib.identity(5, dtype = float) returns the identity matrix of the given size
numpy.matlib.rand(3,3) creates a matrix of the given size filled with random data.

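Linear algebra:

Function	Description
`dot`	Dot product of two arrays: the inner product for 1-D arrays and the matrix product for 2-D arrays (not element-wise multiplication).
`vdot`	Dot product of two vectors (inputs are flattened first)
`inner`	Inner product of two arrays
`matmul`	Matrix product of two arrays
`det`	Determinant of an array (numpy.linalg.det)
`solve`	Solves a linear matrix equation (numpy.linalg.solve)
`inv`	Computes the multiplicative inverse of a matrix (numpy.linalg.inv)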
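
A worked sketch of the functions in the table, with the arithmetic spelled out in comments. The matrices are illustrative.

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])
b = np.array([[11, 12], [13, 14]])

d = np.dot(a, b)       # matrix product: [[1*11+2*13, 1*12+2*14], [3*11+4*13, 3*12+4*14]]
m = np.matmul(a, b)    # same result for 2-D inputs
vd = np.vdot(a, b)     # flattens both: 1*11 + 2*12 + 3*13 + 4*14 = 130
inn = np.inner([1, 2, 3], [0, 1, 0])   # 1*0 + 2*1 + 3*0 = 2

det = np.linalg.det(a)                 # 1*4 - 2*3 = -2
x = np.linalg.solve(a, [5, 11])        # solves x + 2y = 5, 3x + 4y = 11
inv = np.linalg.inv(a)

print(d)    # [[37 40] [85 92]]
print(vd, inn, det, x)
print(np.allclose(inv @ a, np.eye(2)))   # True
```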

matplotlib:
Load a downloaded font: # zhfont1 = matplotlib.font_manager.FontProperties(fname="SimHei.ttf") 
System fonts: a=sorted([f.name for f in matplotlib.font_manager.fontManager.ttflist])

plt.title("Matplotlib demo") 
plt.xlabel("x axis caption") 
plt.plot(x,y,"ob") 
plt.show()

plt.subplot(2, 1, 1) 
plt.subplot(2, 1, 2)

plt.bar(x, y, align = 'center') 
plt.bar(x2, y2, color = 'g', align = 'center') 

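hist,bins = np.histogram(a,bins = [0,20,40,60,80,100])  Computes the frequency distribution of the data, which a histogram shows graphically. Rectangles of equal horizontal size correspond to the class intervals, called bins, and the height of each rectangle corresponds to the frequency.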
plt.hist(a, bins = [0,20,40,60,80,100])
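
To make the binning concrete, here is a NumPy-only sketch; the sample data in `a` is an assumption, since the original does not define it. Each bin is half-open except the last, which includes its right edge.

```python
import numpy as np

# Hypothetical sample data; the original snippet does not define a
a = np.array([22, 87, 5, 43, 56, 73, 55, 54, 11, 20, 51, 5, 79, 31, 27])

hist, bins = np.histogram(a, bins=[0, 20, 40, 60, 80, 100])

print(hist)   # [3 4 5 2 1]: counts for [0,20), [20,40), [40,60), [60,80), [80,100]
print(bins)   # [  0  20  40  60  80 100]

# plt.hist(a, bins=[0, 20, 40, 60, 80, 100]) would draw these counts as bars
```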