PDF files are used in many Android applications to present text, images, and graphics. Some applications also need to read the contents of a PDF file and display them inside the app. A PDF reader library handles this: it opens the file and exposes its contents to your code.
In this article, we will build a simple application that extracts text from a PDF file in Android using Jetpack Compose and the iText library.
Step by Step Implementation
Step 1: Create a New Project in Android Studio
To create a new project in Android Studio, refer to How to Create a new Project in Android Studio with Jetpack Compose.
Step 2: Adding dependency in build.gradle file
Navigate to Gradle Scripts > build.gradle.kts (Module :app) and add the following dependency:
dependencies {
    ...
    implementation("com.itextpdf:itextg:5.5.10")
}
After adding the dependency, sync your project so that Gradle downloads it.
For documentation, refer to the iText GitHub repository: https://github.com/itext/itextpdf
Step 3: Adding PDF file to your project
Because we are extracting text from a PDF file, we need to bundle one with the app. PDF files go in the raw resource folder, which must be created first; refer to Resource Raw Folder in Android Studio for how to create it. Once the raw directory exists, copy your PDF file into it. The code in this article expects the file to be named android.pdf.
You can download a sample pdf from here.
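Android derives a resource name from the file name without its extension, so android.pdf in the raw folder becomes R.raw.android, and file names there are limited to lowercase letters, digits, and underscores. The following is a minimal sketch for checking a file name before copying it in. Both helper functions are hypothetical (they are not part of the Android SDK), and the check is deliberately conservative, so the build tools may accept slightly more than it does.

```kotlin
// Hypothetical helpers for illustration; not part of the Android SDK.

// Returns the resource name Android would derive from a file in res/raw:
// the file name without its extension (android.pdf -> android).
fun rawResourceName(fileName: String): String = fileName.substringBeforeLast('.')

// Conservative assumption: a lowercase letter followed by lowercase letters,
// digits, or underscores. The build tools may accept slightly more than this.
fun isSafeRawFileName(fileName: String): Boolean =
    Regex("[a-z][a-z0-9_]*").matches(rawResourceName(fileName))

fun main() {
    println(rawResourceName("android.pdf"))
    println(isSafeRawFileName("android.pdf"))
    println(isSafeRawFileName("My Report.pdf"))
}
```

A name such as My Report.pdf fails the check because of the uppercase letters and the space; renaming it to something like my_report.pdf avoids a resource error at build time.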
Step 4: Working with the MainActivity.kt file
Open the MainActivity.kt file and add the following code. Comments in the code explain each step.
MainActivity.kt:
package com.geeksforgeeks.demo

import android.os.Bundle
import androidx.activity.ComponentActivity
import androidx.activity.compose.setContent
import androidx.compose.foundation.*
import androidx.compose.foundation.layout.*
import androidx.compose.material3.*
import androidx.compose.runtime.*
import androidx.compose.ui.*
import androidx.compose.ui.graphics.Color
import androidx.compose.ui.unit.*
import com.itextpdf.text.pdf.PdfReader
import com.itextpdf.text.pdf.parser.PdfTextExtractor

class MainActivity : ComponentActivity() {
    override fun onCreate(savedInstanceState: Bundle?) {
        super.onCreate(savedInstanceState)
        setContent {
            MaterialTheme {
                TextExtractor()
            }
        }
    }
}

@Composable
fun TextExtractor() {
    // state holding the extracted text
    val extractedText = remember { mutableStateOf("") }
    val scrollState = rememberScrollState()

    Column(
        modifier = Modifier
            .fillMaxSize()
            .padding(6.dp)
            .verticalScroll(scrollState),
        verticalArrangement = Arrangement.SpaceBetween,
        horizontalAlignment = Alignment.CenterHorizontally
    ) {
        // text that displays the extracted content
        Text(text = extractedText.value, color = Color.Black, fontSize = 12.sp)
        Spacer(modifier = Modifier.height(10.dp))
        // button that triggers the extraction
        Button(
            modifier = Modifier
                .fillMaxWidth()
                .padding(horizontal = 20.dp, vertical = 10.dp),
            onClick = {
                // extract the text and publish it to the state
                extractData(extractedText)
            }
        ) {
            // label for the button
            Text(modifier = Modifier.padding(6.dp), text = "Extract Text from PDF")
        }
    }
}

// function to extract text from the pdf
private fun extractData(extractedString: MutableState<String>) {
    // handle exceptions raised while reading the pdf
    try {
        // initialize pdf reader
        val pdfReader = PdfReader("res/raw/android.pdf")
        try {
            // number of pages in the pdf
            val n = pdfReader.numberOfPages
            // collect the text of every page; pdf pages are numbered from 1
            val extractedText = buildString {
                for (page in 1..n) {
                    append(PdfTextExtractor.getTextFromPage(pdfReader, page).trim())
                    append("\n")
                }
            }
            // display the text in the view
            extractedString.value = extractedText
        } finally {
            // close the pdf reader even if extraction fails
            pdfReader.close()
        }
    } catch (e: Exception) {
        // log the failure; the displayed text stays unchanged
        e.printStackTrace()
    }
}
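
The page loop in extractData can be tried without a PDF file or a device. The following is a minimal sketch under stated assumptions: joinPages and extractOrMessage are hypothetical helpers, and the pageText parameter stands in for a per-page extractor such as PdfTextExtractor.getTextFromPage. It shows the 1-based page numbering and one way to turn a failure into a message that could be displayed, instead of only printing a stack trace.

```kotlin
// Hypothetical helper: pageText stands in for a per-page extractor such as
// PdfTextExtractor.getTextFromPage(reader, page); it is not part of iText.
fun joinPages(pageCount: Int, pageText: (Int) -> String): String =
    (1..pageCount).joinToString(separator = "\n") { page ->
        // PDF pages are numbered from 1, so the range starts at 1
        pageText(page).trim()
    }

// Hypothetical wrapper: converts a failure into a message the UI could show.
fun extractOrMessage(extract: () -> String): String =
    runCatching(extract).getOrElse { e -> "Could not read PDF: ${e.message}" }

fun main() {
    // Fake page contents used in place of a real PDF
    val fakePages = mapOf(1 to "  First page ", 2 to "Second page\n")

    // Successful extraction: pages are trimmed and joined with newlines
    println(extractOrMessage { joinPages(2) { page -> fakePages.getValue(page) } })

    // Failed extraction: the exception message is returned as text
    println(extractOrMessage { error("file missing") })
}
```

In the app, the string returned by a wrapper like extractOrMessage could be assigned to the MutableState passed to extractData, so that the user sees why nothing was extracted.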