[OS X TeX] combining pdfs?

Mon Feb 7 14:15:28 CET 2005

Hi all,

By popular demand, I am also posting the file used to generate the MLR2004
post-COLING workshop PDF.

As I'm not a LaTeX expert at all, I won't claim that it is well
written...

I tried to add useful comments, but feel free to contact me if you
need more info. This will typeset correctly only if you have the
papers/##.pdf files (the actual papers) in the same directory, but you
can get the idea...

See you all,

Gilles,

\documentclass[a4paper, 11pt, twoside]{report}
\usepackage[latin1]{inputenc}
\usepackage[T1]{fontenc}
%\usepackage{layout}
\usepackage{tabularx}
\usepackage{makeidx}
%Margins
%\usepackage{a4wide}

%% HARD-SET THE MARGIN SIZES, AS MOST PAPERS ARE BIGGER THAN THE
%% DEFAULT LATEX TEXT AREA.
%% THIS IS RATHER USEFUL FOR HEADERS.
	\oddsidemargin  0cm     %   Left margin on odd-numbered pages.
	\evensidemargin 0cm     %   Left margin on even-numbered pages.
	\marginparwidth 12 pt        %   Width of marginal notes.
	\textwidth 17cm % Width of text line.
	\textheight 24cm % Height of the text area.
	\topmargin -.84 cm
	\hoffset -0.54 cm
	\voffset 0 in

\usepackage{pdfpages}

%% WE MAY USE HYPERREF TO HAVE ACTIVE LINKS IN THE PDF'S TABLE OF
%% CONTENTS AND AUTHOR INDEX,
%% BUT THEY SEEM TO BE QUITE BUGGY...
%\usepackage[plainpages=false, pdfpagelabels,
%            bookmarksopen]{hyperref}

%% This package allows me to put left/right headers with the
%% conference/paper name
\usepackage{fancyhdr}

%% WE WILL GENERATE AN AUTHOR INDEX...	
\makeindex

%% DECOMPOSING THE LIST OF AUTHORS. EACH AUTHOR WILL BE ADDED TO THE
%% AUTHOR INDEX.
%% THE LIST OF AUTHORS HAS TO BE IN THE FORM:
%% {{GivenName1} {FamilyName1},{GivenName2} {FamilyName2},...}
\def\indexAuthors#1{\splitAuthorsA#1,@@@@,}
\def\noauthor{@@@@}

\def\splitAuthorsA#1,{\def\tmp{#1}%
\ifx\tmp\noauthor\else\processAuthor#1\relax\expandafter\splitAuthorsA\fi}

\def\processAuthor#1#2{% #1 is the given name, #2 is the family name,
% used in the index
\index{#2, #1}}
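%% Example (a sketch added for illustration, not used elsewhere in this
%% file): given the definitions above, the call
%%   \indexAuthors{{Lea} {Cyrus}, {Hendrik} {Feddes}}
%% should behave like
%%   \index{Cyrus, Lea}\index{Feddes, Hendrik}
%% i.e. one index entry per author, sorted by family name. The @@@@
%% sentinel appended by \indexAuthors is what stops the recursion.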

%% I WANT THE INDEX TO HAVE THE SAME FORM AS A SECTION
\makeatletter
\renewenvironment{theindex}
                {\if@twocolumn
                   \@restonecolfalse
                 \else
                   \@restonecoltrue
                 \fi
                 \columnseprule \z@
                 \columnsep 35\p@
                 \twocolumn[\section*{\indexname}]%
                 \@mkboth{\MakeUppercase\indexname}%
                         {\MakeUppercase\indexname}%
                 \thispagestyle{plain}\parindent\z@
                 \parskip\z@ \@plus .3\p@\relax
                 \let\item\@idxitem}
                {\if@restonecol\onecolumn\else\clearpage\fi}
\makeatother

% Managing my own table of contents, as I need to have TITLE AND AUTHORS
\makeatletter
\newcommand{\listofpapers}{\@starttoc{ppr}}

%% THIS COMMAND ADDS A PAPER TO THE TABLE OF CONTENTS.
%% IT ALSO TELLS THE FANCYHDR PACKAGE WHICH TITLE TO USE IN THE ODD PAGE
%% HEADERS
\newcommand{\addpaper}[3]{%
	\addcontentsline{ppr}{paper}{{\bfseries #1}\\
	{\hspace{2em}\itshape #2}}%
	\relax
	\indexAuthors{#2}
	\relax
	\ifthenelse{\equal{#3}{}}{\markright{#1}}{\markright{#3}}
	}

% This tells LaTeX the form of the papers' table of contents.
\newcommand{\l@paper}[2]{\@dottedtocline{1}{0em}{0em}{#1}{\itshape #2}}

%% EFFECTIVELY INCLUDE THE PAPER'S PDF IN THE PROCEEDINGS. THE FIRST
%% PAGE OF THE PDF WILL
%% USE AN "EMPTY" PAGE STYLE (with a page number, but no title/line)
%% PARAMS ARE #1: scaling, #2: offset, #3: submission number
%% IN MY ENVIRONMENT, ALL PAPERS ARE IN "../author/papers/##.pdf"
%% WHERE ## IS THE SUBMISSION NUMBER.
\newcommand\GSincludepdf[3]{%
	\def\GSscale{1}
	\def\GSoffset{0pt}
	\ifthenelse{\equal{#1}{}}{}{\def\GSscale{#1}}
	\ifthenelse{\equal{#2}{}}{}{\def\GSoffset{#2}}
	\includepdf[pages={1},
		pagecommand={\thispagestyle{plain}},
		offset=0pt \GSoffset,
		scale=\GSscale]{../author/papers/#3.pdf}
	\includepdf[pages={2-},
		pagecommand={},
		offset=0pt \GSoffset,
		scale=\GSscale]{../author/papers/#3.pdf}
}
\makeatother

% go to odd numbered page
%% ALL PAPERS SHOULD BEGIN ON AN ODD-NUMBERED PAGE (RIGHT-HAND PAGE IN
%% THE PRINTED BOOK)
\newcommand{\clearemptydoublepage}{\newpage{\pagestyle{empty}\cleardoublepage}}

%%%%%%%%%%%%%%%%%%%%%%%
% Article inclusion
%%%%%%%%%%%%%%%%%%%%%%%
%% EFFECTIVELY ADD AN ARTICLE (adds the article to the proceedings
%% and to the TOC). I did not find a way to
%% manage the author index automatically... maybe a LaTeX guru could do
%% this...
%% shortTitle is used in the header; if empty, title is used instead.
%% scale/offset are used to control the included PDF (depending on the
%% margins chosen by the authors, you may want to scale down
%% the paper).
\newcommand{\includearticle}[6]{% #1 = scale, #2 = offset, #3 = title,
% #4 = author, #5 = shortTitle, #6 = filenum/label
\clearemptydoublepage
\relax
\addpaper{#3}{#4}{#5}
\relax
\label{#6}
\GSincludepdf{#1}{#2}{#6}
\markright{}
}

%%%%%%%%%%%%%%%%%%%%%%%
% Changing the Page Styles using fancyhdr
%%%%%%%%%%%%%%%%%%%%%%%
% Use fancy headings
\pagestyle{fancyplain}

% The rightmark is set by addpaper...
\lhead[\fancyplain{\footnotesize\thepage}{\footnotesize\thepage}]%
	{\fancyplain{\footnotesize}{\footnotesize\rightmark}}
\rhead[
	\fancyplain{\footnotesize}{\footnotesize Post COLING 2004 Workshop on  
\textit{Multilingual Linguistic Resources (MLR2004)}}]
	{\fancyplain{\footnotesize\thepage}{\footnotesize\thepage}}
\cfoot[\fancyplain{}{}]{\fancyplain{}{}}
\lfoot[\fancyplain{}{}]{\fancyplain{}{}}
\rfoot[\fancyplain{}{}]{\fancyplain{}{}}

\begin{document}

% Custom title page
\begin{titlepage}
\begin{flushleft}
{\LARGE \bf COLING 2004}\\
\vskip .5cm
{\Large The 20th International Conference on Computational Linguistics
\par
\vskip 0.2cm
Post-Conference Workshop}
\end{flushleft}
\vskip 7cm
\begin{center}
{\LARGE \bf \textsf{Proceedings of the Workshop on}\\
\vskip 0.2cm
\textsf{Multilingual Linguistic Resources}\\
\vskip 0.3cm
\textsf{MLR2004}}
\par\vskip 1.5cm
{\large Editors: \\
Gilles Sérasset, Susan Armstrong, Christian Boitet,\\
Andrei Popescu-Belis, Dan Tufis}
\end{center}
\vskip 7cm
\begin{flushright}
{\Large August 28th, 2004
\par\vskip 0.2cm
University of Geneva, Switzerland}
\end{flushright}
\end{titlepage}

\clearemptydoublepage

% Restart the page counter
\setcounter{page}{1}
\section*{Foreword}

In an ever expanding information society, most information systems are  
now facing the ``multilingual challenge''. Multilingual language  
resources play an essential role in modern information systems. Such  
resources need to provide information on many languages in a common  
framework and should be (re)usable in many applications (for automatic  
or human use).

Many centres have been involved in national and international projects  
dedicated to building harmonised language resources and creating  
expertise in the maintenance and further development of standardised  
linguistic data. These resources include dictionaries, lexicons,  
thesauri, word-nets, and annotated corpora developed along the lines of  
best practices and recommendations. However, since the late 90's, most  
efforts in scaling up these resources remain the responsibility of the  
local authorities, usually, with very low funding (if any) and few  
opportunities for academic recognition of this work. Hence, it is not  
surprising that many of the resource holders and developers have become  
reluctant to give free access to the latest versions of their  
resources, and their actual status is therefore currently rather  
unclear.

The goal of this workshop is to study problems involved in the  
development, management and reuse of lexical resources in a  
multilingual context. Moreover, this workshop provides a forum for  
reviewing the present state of language resources. The workshop is  
meant to bring to the international community qualitative and  
quantitative information about the most recent developments in the area  
of linguistic resources and their use in applications.

The impressive number of submissions (38) to this workshop and to other
workshops and conferences dedicated to similar topics shows that
dealing with multilingual linguistic resources has become a very hot
topic in the Natural Language Processing community.

To cope with the number of submissions, the workshop organising  
committee decided to accept 16 papers from 10 countries based on the  
reviewers' recommendations. Six of these papers will be presented in a  
poster session. The papers constitute a representative selection of  
current trends in research on Multilingual Language Resources, such as  
multilingual aligned corpora, bilingual and multilingual lexicons, and  
multilingual speech resources. The papers also represent a  
characteristic set of approaches to the development of multilingual  
language resources, such as automatic extraction of information from  
corpora, combination and re-use of existing resources, online  
collaborative development of multilingual lexicons, and use of the Web  
as a multilingual language resource.

The development and management of multilingual language resources is a  
long-term activity in which collaboration among researchers is  
essential. We hope that this workshop will gather many researchers
involved in such developments and will give them the opportunity to
discuss, exchange and compare their approaches and strengthen their
collaborations in the field.

The organisation of this workshop would have been impossible without  
the hard work of the program committee who managed to provide accurate  
reviews on time, on a rather tight schedule. We would also like to  
thank the Coling 2004 organising committee that made this workshop  
possible. Finally, we hope that this workshop will yield fruitful  
results for all participants.

\begin{flushright}
{\bfseries Gilles Sérasset}\\
Organising chair\\
GETA (Study Group for Machine Translation), CLIPS-IMAG laboratory\\
Université Joseph Fourier, France
\end{flushright}

%\newpage
%\layout
\newpage
\section*{Program Committee}
\vskip2em
\renewcommand{\arraystretch}{1.5}
\begin{tabularx}{\linewidth}{>{\setlength{\hsize}{.6\hsize}}X>{\setlength{\hsize}{1.4\hsize}}X}
Gilles Sérasset {\itshape (Chair)}& GETA CLIPS-IMAG, Université Joseph  
Fourier - Grenoble I, France\\
Susan Armstrong & ISSCO, Université de Genève, Switzerland\\
Pushpak Battacharya & IIT, Mumbai, India\\
Igor Boguslavski & IITP, Moscow, Russia\\
Christian Boitet & GETA CLIPS-IMAG, Université Joseph Fourier -  
Grenoble I, France\\
Pierrette Bouillon & ISSCO, Université de Genève, Switzerland\\
Jim Breen & Monash University, Australia\\
Nicoletta Calzolari & CNR, Pisa, Italy\\
Dan Cristea & University Al.I.Cuza Iasi, Romania\\
Patrick Drouin & OLST, University of Montreal, Canada\\
Sanae Fujita & NTT, Kyoto, Japan\\
Ulrich Heid & IMS-CL, University of Stuttgart, Germany\\
Hitoshi Isahara & CRL, Nara, Japan\\
Kyo Kageura & NII, Tokyo, Japan\\
Chuah Choy Kim & USM, Penang, Malaysia\\
Mathieu Mangeot & NII, Tokyo, Japan\\
Alain Polguère & OLST, University of Montreal, Canada\\
Andrei Popescu-Belis & ISSCO, Université de Genève, Switzerland\\
Jean Senellart & SYSTRAN, France\\
Mandel Shi & Xiamen University, China\\
Virach Sornlertlamvanich & Thai Computational Linguistics Laboratory,  
CRL, Thailand\\
Pr. Kumiko Tanaka-Ishii & Tokyo University, Japan\\
Philippe Thoiron & CRTT, Université de Lyon 2, France\\
Dan Tufis & RACAI, Uni Bucharest, Romania\\
Michael Zock & LIMSI, Orsay, France
\end{tabularx}

\newpage

\section*{Tentative Program}
\vskip2em
\renewcommand{\arraystretch}{1.5}
\begin{tabular}{|p{.2\linewidth}|p{.7\linewidth}|}
\hline
{\bfseries Time}&{\bfseries Event}\\
\hline
08:30 -- 09:00 & Registration \& Welcome\\
\hline
09:00 -- 10:00 & Paper Session
\begin{itemize}
\item {\bfseries JMdict: a Japanese-Multilingual Dictionary}
\item {\bfseries A Generic Collaborative Platform for Multilingual  
Lexical Databases Development}
\end{itemize}
\\
\hline
10:00 -- 11:00 & Poster Session and open discussions
\begin{itemize}
\item {\bfseries Semi-Automatic Construction of Korean-Chinese Verb  
Patterns based on Translation Equivalency}
\item {\bfseries Bilingual Sign Language Dictionary to Learn the Second  
Sign Language without Learning a Target Spoken Language}
\item {\bfseries Building Parallel Corpora for eContent Professionals}
\item {\bfseries Revising the \textsc{Wordnet Domains} Hierarchy:  
semantics, coverage and balancing}
\item {\bfseries PolyphraZ: a tool for the management of parallel  
corpora}
\item {\bfseries Multilingual Text Induced Spelling Correction}
\end{itemize}
\\
\hline
11:00 -- 11:30 & Coffee Break\\
\hline
11:30 -- 13:00 & Paper Session
\begin{itemize}
\item {\bfseries A Model for Fine-Grained Alignment of Multilingual  
Texts}
\item {\bfseries Identifying correspondences between words: an approach  
based on a bilingual syntactic analysis of French/English parallel  
corpora}
\item {\bfseries Multilingual Aligned Parallel Treebank Corpus  
Reflecting Contextual Information and Its Applications}
\end{itemize}
\\
\hline
13:00 -- 14:00 & Lunch Break\\
\hline
\end{tabular}
\clearpage
\begin{tabular}{|p{.2\linewidth}|p{.7\linewidth}|}
\hline
{\bfseries Time}&{\bfseries Event}\\
\hline
14:00 -- 15:00 & Paper Session
\begin{itemize}
\item {\bfseries A Method of Creating New Bilingual Valency Entries  
using Alternations}
\item {\bfseries Automatic Construction of a Transfer Dictionary  
Considering Directionality}
\end{itemize}
\\
\hline
15:00 -- 15:30 & Poster Session and open discussions
\begin{itemize}
\item {\bfseries Semi-Automatic Construction of Korean-Chinese Verb  
Patterns based on Translation Equivalency}
\item {\bfseries Bilingual Sign Language Dictionary to Learn the Second  
Sign Language without Learning a Target Spoken Language}
\item {\bfseries Building Parallel Corpora for eContent Professionals}
\item {\bfseries Revising the \textsc{Wordnet Domains} Hierarchy:  
semantics, coverage and balancing}
\item {\bfseries PolyphraZ: a tool for the management of parallel  
corpora}
\item {\bfseries Multilingual Text Induced Spelling Correction}
\end{itemize}
\\
\hline
15:30 -- 16:00 & Coffee Break\\
\hline
16:00 -- 17:30 & Paper Session \begin{itemize}
\item {\bfseries Building and sharing multilingual speech resources,  
using ERIM generic platforms}
\item {\bfseries Multilinguality in ETAP-3: Reuse of Lexical Resources}
\item {\bfseries Qualitative Evaluation of Automatically Calculated  
Acception Based MLDB}
\end{itemize}
\\
\hline
17:30 -- 18:00 & Open discussions \& Closing\\
\hline
\end{tabular}

\newpage

%\tableofcontents
\section*{Contents}
\listofpapers

\clearpage

%\includepdf[pages={-},
%          pagecommand={},
%            addtotoc={1, section, 0, Paper Template for COLING2004
%            Geneva, final22}]{coling-submission-template.pdf}
%

% #22.
% Multilinguality in ETAP-3: Reuse of Lexical Resources;
% Boguslavsky Igor and Iomdin, Leonid and Sizov, Victor
\includearticle{.95}{}{Multilinguality in ETAP-3: Reuse of Lexical  
Resources}{{Igor} {Boguslavsky}, {Leonid} {Iomdin}, {Victor}  
{Sizov}}{}{final-22}

% #32.
% A Model for Fine-Grained Alignment of Multilingual Texts;
% Cyrus, Lea and Feddes, Hendrik
\includearticle{}{}{A Model for Fine-Grained Alignment of Multilingual  
Texts}{{Lea} {Cyrus}, {Hendrik} {Feddes}}{}{final-32}

% #15.
% Qualitative Evaluation of Automatically Calculated Acception Based
% MLDB;
% Teeraparbseree, Aree
\includearticle{}{}{Qualitative Evaluation of Automatically Calculated  
Acception Based MLDB}{{Aree} {Teeraparbseree}}{}{final-15}

% #26.
% Automatic Construction of a Transfer Dictionary Considering
% Directionality;
% Paik, Kyonghee and Shirai, Satoshi and Nakaiwa, Hiromi
\includearticle{}{-10pt}{Automatic Construction of a Transfer Dictionary
Considering Directionality}{{Kyonghee} {Paik}, {Satoshi} {Shirai},  
{Hiromi} {Nakaiwa}}{}{final-26}

% #35.
% Building and sharing multilingual speech resources, using ERIM
% generic platforms;
% FAFIOTTE, Georges
\includearticle{.95}{-5pt}{Building and Sharing Multilingual Speech
Resources Using ERIM Generic Platforms}{{Georges}  
{Fafiotte}}{}{final-35}

% #20.
% A Method of Creating New Bilingual Valency Entries using Alternations;
%  Fujita, Sanae and Bond, Francis
\includearticle{}{}{A Method of Creating New Bilingual Valency Entries  
using Alternations}{{Sanae} {Fujita}, {Francis} {Bond}}{}{final-20}

% #25.
% Identifying correspondences between words: an approach based on a
% bilingual syntactic analysis of French/English parallel corpora;
%  Ozdowska, Sylwia
\includearticle{.95}{}{Identifying Correspondences Between Words: an  
Approach Based on a Bilingual Syntactic Analysis of French/English  
Parallel Corpora}{{Sylwia} {Ozdowska}}{Identifying Correspondences  
Between Words: \ldots}{final-25-adobe}

% #14.
% Multilingual Aligned Parallel Treebank Corpus Reflecting Contextual
% Information and Its Applications;
%  Uchimoto, Kiyotaka and Zhang, Yujie and Sudo, Kiyoshi and Murata,
%  Masaki and Sekine, Satoshi and Isahara, Hitoshi
\includearticle{}{-15pt}{Multilingual Aligned Parallel Treebank Corpus  
Reflecting Contextual Information and Its Applications}{{Kiyotaka}  
{Uchimoto}, {Yujie} {Zhang}, {Kiyoshi} {Sudo}, {Masaki} {Murata},  
{Satoshi} {Sekine}, {Hitoshi} {Isahara}}{}{final-14}

% #34.
% JMdict: a Japanese-Multilingual Dictionary;
%  Breen, Jim
\includearticle{}{}{JMdict: a Japanese-Multilingual Dictionary}{{Jim}  
{Breen}}{}{final-34-adobe}

% #42.
% A Generic Collaborative Platform for Multilingual Lexical Databases
% Development;
%  Sérasset, Gilles
\includearticle{}{}{A Generic Collaborative Platform for Multilingual  
Lexical Database Development}{{Gilles} {Sérasset}}{}{final-42}

% #18.
% Semi-Automatic Construction of Korean-Chinese Verb Patterns based on
% Translation Equivalency;
%  Hong, Munpyo and Kim, Young-Kil and Park, Sang-Kyu and Lee, Young-Jik
\includearticle{}{-20pt}{Semi-Automatic Construction of Korean-Chinese  
Verb Patterns Based on Translation Equivalency}{{Munpyo} {Hong},  
{Young-Kil} {Kim}, {Sang-Kyu} {Park}, {Young-Jik} {Lee}}{}{final-18}

% #10.
% Bilingual Sign Language Dictionary to Learn the Second Sign Language
% without Learning a Target Spoken Language;
%  SUZUKI, Emiko and HORIKOSHI, Mariko and KAKIHANA, Kyoko
\includearticle{.95}{-15pt}{Bilingual Sign Language Dictionary to Learn  
the Second Sign Language without Learning a Target Spoken  
Language}{{Emiko} {Suzuki}, {Mariko} {Horikoshi}, {Kyoko}  
{Kakihana}}{Bilingual Sign Language Dictionary to Learn the Second Sign  
Language\ldots}{final-10}

% #30.
% Building Parallel Corpora for eContent Professionals;
%  Gavrilidou, M. and Labropoulou, P. and Desipri, E. and Giouli, V.
%  and Antonopoulos, V. and Piperidis, S.
\includearticle{.95}{}{Building Parallel Corpora for eContent  
Professionals}{{M.} {Gavrilidou}, {P.} {Labropoulou}, {E.} {Desipri},  
{V.} {Giouli}, {V.} {Antonopoulos}, {S.} {Piperidis}}{}{final-30}

% #43.
% Revising the WordNet Domains Hierarchy: semantics, coverage and
% balancing;
%  Bentivogli, Luisa and Forner, Pamela and Magnini, Bernardo and
%  Pianta, Emanuele
\includearticle{.95}{-10pt}{Revising the \textsc{Wordnet Domains}  
Hierarchy: semantics, coverage and balancing}{{Luisa} {Bentivogli},  
{Pamela} {Forner}, {Bernardo} {Magnini}, {Emanuele}  
{Pianta}}{}{final-43}

% #38.
% PolyphraZ : a tool for the management of parallel corpora;
%  Hajlaoui, Najeh and Boitet, Christian
\includearticle{.95}{-10pt}{PolyphraZ: a Tool for the Management of  
Parallel Corpora}{{Najeh} {Hajlaoui}, {Christian} {Boitet}}{}{final-38}

% #40.
% Multilingual Text Induced Spelling Correction;
%  Reynaert, Martin
\includearticle{.95}{}{Multilingual Text Induced Spelling  
Correction}{{Martin} {Reynaert}}{}{final-40}

\clearemptydoublepage

%% GENERATE THE INDEX. FOR THIS TO WORK, DO NOT FORGET TO RUN THE PDFLATEX
%% COMMAND, FOLLOWED BY THE MAKEINDEX COMMAND,
%% THEN PDFLATEX AGAIN.
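%% A minimal sketch of that sequence, assuming this file is saved as
%% proceedings.tex (a hypothetical name; substitute your own):
%%   pdflatex proceedings
%%   makeindex proceedings
%%   pdflatex proceedings
%% The second pdflatex run also picks up the .ppr table of contents.
%% If page numbers shift between runs, one more pdflatex run may be needed.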
\renewcommand\indexname{Author Index}
\printindex

\end{document}

--
Gilles Sérasset
GETA-CLIPS-IMAG (UJF, INPG & CNRS)
BP 53 - F-38041 Grenoble Cedex 9
Phone: +33 4 76 51 43 80
Fax:   +33 4 76 44 66 75
