How to use LibreOffice to convert document formats under CentOS

The project needs to preprocess uploaded documents: when a user uploads a file in doc format, it must be converted to docx or pdf so that later steps can extract its content.

I first tried the phpoffice/phpword package, but its handling of doc files was not ideal. The package is better suited to generating documents from content than to converting existing ones, so it did not fit my needs.

Then I found the open source tool LibreOffice. It worked very well for this task, so I'd like to share how I use it.

The server runs CentOS 7, so LibreOffice can be installed directly with yum. It takes a little over 600 MB of disk space:

# Optional: remove any previously installed copy first
yum remove libreoffice-*
yum install libreoffice

When the installation finishes, confirm the version. At the time of writing, the latest official release was 6.1 while the yum package was still 5.3.6, but the older version works fine for conversion. I still recommend installing through your distribution's own package manager, which saves a lot of trouble.

[root@localhost /]# soffice --version
LibreOffice 5.3.6.1 30 (Build: 1)
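
The version string can also be checked from a script. This is a minimal sketch, assuming the output keeps the "LibreOffice <version> ..." shape shown above; lo_version and lo_major_at_least are hypothetical helper names, not part of LibreOffice:

```shell
# Print the version number from `soffice --version` output read on stdin.
# Example input: "LibreOffice 5.3.6.1 30 (Build: 1)"
lo_version() {
    awk '/^LibreOffice/ { print $2; exit }'
}

# Succeed when the major version of $1 is at least $2.
lo_major_at_least() {
    version=$1
    minimum=$2
    major=${version%%.*}    # keep everything before the first dot
    [ "$major" -ge "$minimum" ]
}

# Usage on a machine with LibreOffice installed:
#   v=$(soffice --version | lo_version)
#   lo_major_at_least "$v" 5 || echo "LibreOffice is older than expected" >&2
```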

If you are not sure how to use it, run soffice --help to see the available options; there are many parameters and use cases. Format conversion itself is simple:

soffice --headless --convert-to docx /opt/upload/source/123.doc --outdir /opt/upload/source

The above command converts the file /opt/upload/source/123.doc to docx format and writes the result to the /opt/upload/source folder.

By default:

  1. The output file is named after the source file, with the new extension;
  2. An existing file with the same name in outdir is overwritten.
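
Because of the second default, a caller may want to know the output path before converting. The sketch below predicts it from the two rules above, assuming that only the last extension of the source name is replaced; expected_output and safe_to_convert are hypothetical helper names:

```shell
# Predict where soffice will write the converted file.
expected_output() {
    src=$1      # source file, e.g. /opt/upload/source/123.doc
    fmt=$2      # target extension, e.g. docx
    outdir=$3   # directory passed to --outdir
    name=$(basename "$src")
    # Drop a trailing slash from outdir and the last extension from name.
    printf '%s/%s.%s\n' "${outdir%/}" "${name%.*}" "$fmt"
}

# Succeed only when converting would not overwrite an existing file.
safe_to_convert() {
    [ ! -e "$(expected_output "$1" "$2" "$3")" ]
}

# Usage:
#   if safe_to_convert /opt/upload/source/123.doc docx /opt/upload/source; then
#       soffice --headless --convert-to docx /opt/upload/source/123.doc --outdir /opt/upload/source
#   fi
```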

A successful conversion will output something like this:

convert /opt/upload/source/123.doc -> /opt/upload/source/123.docx using filter : MS Word 2007 XML
Overwriting: /opt/upload/source/123.docx
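
If a later step needs the path of the converted file, it can be read from this output. This is a rough sketch that assumes the "convert ... -> ... using filter" line looks like the sample above, which may differ between LibreOffice versions; converted_path is a hypothetical helper name:

```shell
# Read soffice output on stdin and print the destination path of each
# "convert SRC -> DST using filter : NAME" line. Other lines are ignored.
converted_path() {
    sed -n 's/^convert .* -> \(.*\) using filter.*$/\1/p'
}

# Usage:
#   soffice --headless --convert-to docx /opt/upload/source/123.doc \
#       --outdir /opt/upload/source | converted_path
```

An empty result means no conversion line was seen, which a caller can treat as a failure.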

LibreOffice automatically selects the conversion filter based on the file format. For the full list of supported formats, refer to the official website.
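
Since the filter is chosen automatically, the same command can be applied to every uploaded doc file in a folder. The following is a minimal sketch, not a definitive implementation: convert_uploads and the SOFFICE variable are hypothetical names, and success is judged by both the exit status and the presence of the expected output file:

```shell
# Command used for conversion; can be overridden, e.g. for testing.
SOFFICE=${SOFFICE:-soffice}

# Convert every .doc file in a directory, one soffice call per file.
convert_uploads() {
    srcdir=$1   # directory holding uploaded .doc files
    fmt=$2      # target format, e.g. docx or pdf
    outdir=$3   # where converted files are written
    failed=0
    for f in "$srcdir"/*.doc; do
        [ -e "$f" ] || continue     # no matches: the glob stays literal
        base=$(basename "$f")
        target="${outdir%/}/${base%.*}.$fmt"
        # Treat a non-zero exit or a missing output file as a failure.
        if ! "$SOFFICE" --headless --convert-to "$fmt" "$f" --outdir "$outdir" \
            || [ ! -e "$target" ]; then
            echo "conversion failed: $f" >&2
            failed=1
        fi
    done
    return "$failed"
}

# Usage:
#   convert_uploads /opt/upload/source pdf /opt/upload/source
```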

Summary

This article showed how to use LibreOffice on CentOS to convert document formats. I hope it is helpful to you.