Sisyphus repository
Last update: 1 October 2023 | SRPMs: 18631 | Visits: 37373442
S:5.3.0-alt1
5.0: 2.03-alt2
4.1: 2.01-alt1

Group :: Graphics
RPM: tesseract


#set_verify_elf_method none

Name: tesseract
Version: 5.3.0
Release: alt1

Summary: Tesseract Open Source OCR Engine
Summary(ru_RU.UTF-8): Движок распознавания текста с открытым исходным кодом

License: Apache-2.0
Group: Graphics
Url: https://github.com/tesseract-ocr

Packager: Andrey Cherepanov <cas at altlinux.org>

# Source-url: https://github.com/tesseract-ocr/tesseract/archive/refs/tags/%version.tar.gz

Source: %name-%version.tar

# Install language files into /usr/share/tesseract/tessdata

Patch: tesseract-5.1.0-alt-makefile.patch

BuildRequires: gcc-c++
BuildRequires: libtiff-devel
BuildRequires: libleptonica-devel >= 1.74
BuildRequires: autoconf-archive
BuildRequires: libpango-devel
BuildRequires: libcairo-devel
BuildRequires: libicu-devel
BuildRequires: doxygen

Requires: %name-langpack-en >= 4.1.0
Requires: %name-langpack-ru >= 4.1.0

%description
This package contains an OCR engine (libtesseract) and a command-line
program (tesseract). Tesseract supports Unicode (UTF-8) and can recognize
more than 100 languages "out of the box". Tesseract supports various output
formats: plain text, hOCR (HTML), PDF, TSV. To improve OCR accuracy, improve
the quality of the input image.

%description -l ru_RU.UTF-8
Этот пакет содержит движок распознавания текста - libtesseract и программу
командной строки - tesseract. Tesseract поддерживает юникод (UTF-8) и может
распознавать более 100 языков "из коробки". Tesseract поддерживает различные
форматы вывода: простой текст, hOCR (HTML), PDF, TSV. Чтобы улучшить распознавание текста,
необходимо улучшить качество анализируемого изображения.

%package devel
Summary: Development files for tesseract
Summary(ru_RU.UTF-8): Файлы разработки для tesseract
Group: Development/C
Requires: %name
Requires: libleptonica-devel >= 1.74

%description devel
The %name-devel package contains the header files needed for
developing applications that use %name.

%description devel -l ru_RU.UTF-8
Пакет %name-devel содержит заголовочные файлы для
разработки приложений, использующих %name.

%package doc
Summary: Tesseract OCR Tool Documentation
Summary(ru_RU.UTF-8): Документация по движку Tesseract OCR
Group: Documentation
BuildArch: noarch

%description doc
The documentation describes the library functions and the tesseract
utilities. The development section includes examples of training language
models.

%description doc -l ru_RU.UTF-8
Документация содержит описание функций библиотеки и утилит %name. В разделе
разработки есть примеры обучения языковых моделей.

%prep
%setup
%patch -p2
%ifarch %e2k
# LCC autovectorization performs better than these brief SIMD snippets
sed -i "/CHECK_COMPILE_FLAG/{N;/_OPT/s/=true/=false/}" configure.ac
%add_optflags -mno-sse
%endif

%build
%autoreconf
%configure --disable-static
%make_build

# Build the tools for training language models (dev)

%make_build training

doxygen doc/Doxyfile

%install
%makeinstall_std
%makeinstall_std training-install

# link to a non-existent file

rm -f %buildroot%_libdir/*.la

%files
%doc AUTHORS ChangeLog README.md LICENSE
%_bindir/*
%_datadir/%name/tessdata/configs
%_datadir/%name/tessdata/tessconfigs
%_datadir/%name/tessdata/pdf.ttf
%_libdir/lib*.so.5*

%files devel
%_includedir/%name
%_libdir/lib*.so
%_pkgconfigdir/%name.pc

%files doc
%doc doc/html/*

%changelog

The full changelog is available on the package's Changelog page.
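
The sed expression in the %prep section (the e2k branch) is terse, so here is a small sketch of what it does. It runs the same expression against a made-up configure.ac fragment. The macro arguments and the variable names avx and openmp are assumptions chosen for illustration and are not copied from the real Tesseract sources. GNU sed is assumed.

```shell
#!/bin/sh
# Hypothetical configure.ac fragment; the macro arguments and variable
# names below are invented for illustration only.
sample=$(mktemp)
cat > "$sample" <<'EOF'
AX_CHECK_COMPILE_FLAG([-mavx], [avx=true], [avx=false], [$WERROR])
AM_CONDITIONAL([AVX_OPT], ${avx})
AX_CHECK_COMPILE_FLAG([-fopenmp], [openmp=true], [openmp=false])
AM_CONDITIONAL([OPENMP], ${openmp})
EOF

# Same expression as in the spec, without -i so the result goes to stdout:
#   /CHECK_COMPILE_FLAG/   select lines that probe a compiler flag
#   N                      append the following line to the pattern space
#   /_OPT/s/=true/=false/  if the pair mentions _OPT, flip the first "=true"
patched=$(sed '/CHECK_COMPILE_FLAG/{N;/_OPT/s/=true/=false/}' "$sample")
rm -f "$sample"

printf '%s\n' "$patched"
```

In this sample the first pair mentions _OPT, so its success branch becomes avx=false and the optimized code path is never enabled. The second pair has no _OPT marker and is printed unchanged.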
