Weighted Ridge Regression in R

Last Updated : 23 Jul, 2025

Weighted Ridge Regression extends regular Ridge Regression by assigning different weights to data points based on their importance. Giving more influence to reliable observations makes the model more flexible and, when the weights are well chosen, can improve its accuracy.

What is Ridge Regression?

Ridge Regression is a method used in statistics and machine learning to handle multicollinearity, which occurs when predictor variables are highly correlated with each other. It adds a penalty proportional to the sum of the squared coefficients to the least-squares objective, which shrinks the coefficient estimates toward zero and makes them more stable.
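The effect of the penalty can be illustrated with a small sketch. It assumes the textbook objective ||y - Xb||^2 + lambda * ||b||^2 without an intercept, for which the ridge estimate is (X'X + lambda * I)^(-1) X'y. The simulated data, the value lambda = 1, and the helper name ridge_closed_form are illustrative assumptions, not part of any library.

```r
# Illustrative sketch: ridge vs. ordinary least squares with two
# nearly identical predictors (all names and values are hypothetical).
set.seed(1)
n  <- 50
x1 <- rnorm(n)
x2 <- x1 + rnorm(n, sd = 0.01)   # almost a copy of x1, i.e. multicollinearity
X  <- cbind(x1, x2)
y  <- x1 + x2 + rnorm(n)

# Closed-form ridge estimate for ||y - X b||^2 + lambda * ||b||^2 (no intercept)
ridge_closed_form <- function(X, y, lambda) {
  p <- ncol(X)
  drop(solve(crossprod(X) + lambda * diag(p), crossprod(X, y)))
}

beta_ols   <- ridge_closed_form(X, y, lambda = 0)  # lambda = 0 gives least squares
beta_ridge <- ridge_closed_form(X, y, lambda = 1)

print(rbind(OLS = beta_ols, Ridge = beta_ridge))
```

For any lambda > 0 the ridge coefficient vector has a smaller Euclidean norm than the least-squares one; with near-duplicate predictors the ridge estimates also tend to be split more evenly between x1 and x2.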

What is Weighted Ridge Regression?

Weighted Ridge Regression is like a customized version of Ridge Regression. Instead of treating all data points the same, it gives more weight to some data points than others. It's a way to make the regression model more flexible and tailored to the specific importance of each data point.

The weighted ridge regression objective is:

\min_{\beta} \left\{ \frac{1}{2n} \sum_{i=1}^{n} w_i (y_i - x_i^T\beta)^2 + \lambda \sum_{j=1}^{p} \beta_j^2 \right\}

Where:

  • n: Number of observations.
  • p: Number of predictors.
  • w_i: Weight assigned to the i-th observation.
  • λ: Ridge regularization parameter.
  • y_i: Observed response for the i-th observation.
  • x_i: Vector of predictors for the i-th observation.
  • β: Coefficient vector to be estimated.
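For this objective without an intercept, setting the gradient to zero gives the linear system (X'WX + 2nλI)β = X'Wy, where W is the diagonal matrix of weights. The sketch below solves that system directly and checks that the gradient vanishes at the solution. The helper name weighted_ridge and the simulated data are assumptions made for illustration. Note that glmnet scales the ridge penalty differently, fits an intercept, and standardizes predictors by default, so its results at the same lambda value need not match this closed form.

```r
# Minimal sketch: closed-form minimizer of
#   (1 / (2n)) * sum(w * (y - X b)^2) + lambda * sum(b^2)
# assuming no intercept. `weighted_ridge` is a hypothetical helper name.
weighted_ridge <- function(X, y, w, lambda) {
  n  <- nrow(X)
  p  <- ncol(X)
  Xw <- X * w                                        # row i of X scaled by w[i], i.e. W X
  A  <- crossprod(X, Xw) + 2 * n * lambda * diag(p)  # X' W X + 2 n lambda I
  b  <- crossprod(Xw, y)                             # X' W y
  drop(solve(A, b))
}

set.seed(42)
n <- 100
p <- 3
X <- matrix(rnorm(n * p), ncol = p)
y <- drop(X %*% c(1, -2, 0.5)) + rnorm(n)
w <- runif(n)
lambda <- 0.1

beta_hat <- weighted_ridge(X, y, w, lambda)

# Gradient of the objective at beta_hat; it should be numerically zero.
grad <- -crossprod(X, w * (y - drop(X %*% beta_hat))) / n + 2 * lambda * beta_hat
print(beta_hat)
print(max(abs(grad)))
```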

Implementing Weighted Ridge Regression

R
library(glmnet)

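# Generate data
set.seed(123)
n <- 100                             # number of observations
p <- 5                               # number of predictors
X <- matrix(rnorm(n * p), ncol = p)  # predictor matrix
y <- rnorm(n)                        # response (pure noise, for illustration only)
weights <- runif(n)                  # observation weights in (0, 1)

# Specify regularization parameter lambda
lambda <- 0.1

# Fit the weighted ridge regression model (alpha = 0 selects the ridge penalty)
fit <- glmnet(X, y, alpha = 0, weights = weights, lambda = lambda)

# Cross-validation to select lambda, using the same observation weights
cv_fit <- cv.glmnet(X, y, alpha = 0, weights = weights)
best_lambda <- cv_fit$lambda.min

# Obtain coefficient estimates.
# Caution: fit was trained at a single lambda (0.1), so glmnet returns the
# coefficients at that value whatever s is. For estimates at best_lambda,
# query the cross-validated object: coef(cv_fit, s = "lambda.min").
coef(fit, s = best_lambda)

# Example: 10 new observations
newX <- matrix(rnorm(10 * p), ncol = p)

# Make predictions using the fitted model (the same caution about s applies)
predictions <- predict(fit, newx = newX, s = best_lambda)
predictions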

Output:

6 x 1 sparse Matrix of class "dgCMatrix"
                     s1
(Intercept) -0.07566374
V1          -0.05574298
V2           0.05648674
V3          -0.08911275
V4          -0.05532924
V5           0.21264334

               s1
 [1,] -0.54065480
 [2,] -0.19980833
 [3,] -0.37937056
 [4,] -0.14562695
 [5,] -0.31000107
 [6,]  0.02000745
 [7,] -0.09252755
 [8,] -0.29952834
 [9,]  0.01349644
[10,]  0.09333124

  • We generate newX as an example matrix of predictor values for 10 new observations.
  • We then call predict() on the fitted model fit with newx = newX and s = best_lambda, where best_lambda is the lambda value selected through cross-validation. Because fit was trained at a single lambda (0.1), glmnet returns predictions at that value regardless of s; to predict at best_lambda, call predict() on cv_fit with s = "lambda.min" or refit the model at best_lambda.
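Because a glmnet object fitted with a single lambda returns results at that lambda regardless of s, one way to obtain estimates at the cross-validated value is to query the cv.glmnet object directly. The sketch below assumes the same simulated setup as the example above; the names coef_cv and pred_cv are illustrative.

```r
library(glmnet)

set.seed(123)
n <- 100
p <- 5
X <- matrix(rnorm(n * p), ncol = p)
y <- rnorm(n)
weights <- runif(n)

# Cross-validate over glmnet's default lambda path with the same weights
cv_fit <- cv.glmnet(X, y, alpha = 0, weights = weights)

# Coefficients and predictions at the lambda that minimizes the CV error
coef_cv <- coef(cv_fit, s = "lambda.min")
newX    <- matrix(rnorm(10 * p), ncol = p)
pred_cv <- predict(cv_fit, newx = newX, s = "lambda.min")

print(cv_fit$lambda.min)
print(coef_cv)
print(dim(pred_cv))
```

Passing s = "lambda.1se" instead selects the more heavily regularized model whose CV error is within one standard error of the minimum.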

Visualizing Coefficients

R
# Install ggplot2 only if it is not already installed
if (!requireNamespace("ggplot2", quietly = TRUE)) install.packages("ggplot2")
library(ggplot2)

# Collect the coefficients in a data frame for plotting
coefficients <- as.matrix(coef(fit, s = best_lambda))
coefficients_df <- data.frame(
  Variable = rownames(coefficients),
  Coefficient = coefficients[, 1],
  Sign = ifelse(coefficients[, 1] > 0, "Positive", "Negative")
)

# Create a bar plot of coefficients
ggplot(coefficients_df, aes(x = reorder(Variable, Coefficient), y = Coefficient, fill = Sign)) +
  geom_bar(stat = "identity", position = "identity", color = "black") +
  coord_flip() +
  theme_minimal() +
  labs(title = "Weighted Ridge Regression Coefficients", x = "Variable", y = "Coefficient")

Output:

[Figure: Weighted Ridge Regression in R (bar plot of the fitted coefficients)]

Difference Between Ridge Regression and Weighted Ridge Regression

| Feature | Ridge Regression | Weighted Ridge Regression |
|---|---|---|
| Treatment of Data Points | Treats all data points equally. | Assigns individual weights to data points based on importance or reliability. |
| Handling of Multicollinearity | Adds a penalty term to shrink coefficients towards zero. | Uses the same penalty as Ridge Regression, with the added capability to incorporate observation-specific weights. |
| Flexibility and Customization | Limited customization: the penalty strength is the only tuning choice. | Offers more flexibility by allowing observation-specific weights, leading to a tailored analysis. |

Advantages of Weighted Ridge Regression

  1. Flexibility: It handles heterogeneous data by incorporating observation-specific weights.
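  2. Improved Prediction: Can lead to more accurate predictions when the weights reflect how reliable each observation is, especially with noisy data.
  3. Robustness: Down-weighting suspect observations mitigates the impact of outliers, while the ridge penalty helps limit overfitting.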
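Whether weighting helps depends on how the weights are chosen. The sketch below simulates data in which half of the observations are much noisier than the rest and compares equal weights with inverse-variance weights. The simulation settings, the assumption that the noise variances are known, and the helper weighted_ridge (a closed-form solver without intercept) are all illustrative.

```r
# Illustrative sketch: inverse-variance weights under heteroscedastic noise.
# `weighted_ridge` is a hypothetical closed-form helper (no intercept) for
#   (1 / (2n)) * sum(w * (y - X b)^2) + lambda * sum(b^2)
weighted_ridge <- function(X, y, w, lambda) {
  n  <- nrow(X)
  Xw <- X * w
  drop(solve(crossprod(X, Xw) + 2 * n * lambda * diag(ncol(X)), crossprod(Xw, y)))
}

set.seed(7)
n <- 200
p <- 4
beta_true <- c(2, -1, 0.5, 0)
X <- matrix(rnorm(n * p), ncol = p)
noise_sd <- ifelse(seq_len(n) <= n / 2, 0.5, 5)   # second half is much noisier
y <- drop(X %*% beta_true) + rnorm(n, sd = noise_sd)

w_equal  <- rep(1, n)
w_invvar <- 1 / noise_sd^2
w_invvar <- w_invvar * n / sum(w_invvar)          # rescale so weights sum to n

lambda <- 0.01
beta_equal  <- weighted_ridge(X, y, w_equal, lambda)
beta_invvar <- weighted_ridge(X, y, w_invvar, lambda)

# Squared estimation error of each fit relative to the true coefficients
err <- c(equal = sum((beta_equal - beta_true)^2),
         inverse_variance = sum((beta_invvar - beta_true)^2))
print(err)
```

In practice the noise variances are rarely known and have to be estimated or approximated, which is where the subjectivity of weighting enters.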

Disadvantages of Weighted Ridge Regression

  1. Subjectivity: Assigning weights is subjective and can introduce bias.
  2. Model Instability: Coefficient estimates can be sensitive to changes in the weighting scheme.
  3. Complexity: Adds complexity to modeling and requires expertise.
  4. Assumption: Relies on the weights being independent of the predictor and response variables, that is, fixed in advance and not tuned to the observed data.
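A simple way to gauge this sensitivity is to refit the model under several candidate weighting schemes and compare the coefficients. The sketch below does this with glmnet at one fixed lambda; the three schemes and the simulated data are hypothetical choices made only for illustration.

```r
library(glmnet)

set.seed(123)
n <- 100
p <- 5
X <- matrix(rnorm(n * p), ncol = p)
y <- drop(X %*% c(1, 0.5, 0, -0.5, -1)) + rnorm(n)

# Three hypothetical weighting schemes for the same data
schemes <- list(
  equal  = rep(1, n),
  random = runif(n),
  skewed = ifelse(seq_len(n) <= 10, 10, 1)   # first 10 rows dominate
)

# Fit at one fixed lambda so that only the weights differ between fits
coefs <- sapply(schemes, function(w) {
  fit <- glmnet(X, y, alpha = 0, weights = w, lambda = 0.1)
  as.matrix(coef(fit))[, 1]
})
print(round(coefs, 3))

# Largest change in any single coefficient across the schemes
max_shift <- max(apply(coefs, 1, function(r) diff(range(r))))
print(max_shift)
```

Large shifts between columns suggest that conclusions depend heavily on the weighting choice and that the weights deserve a documented justification.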