
Math/drmf


February 6, 2014 — DRMF meeting

Attendees: Howard Cohl, Bonita Saunders, Marje McClain and Moritz Schubotz

Discussed DRMF development topics.

  1. Search options at http://drmf.instance-proxy.wmflabs.org/wiki/Special:MathSearch
    • search on '\JacobiP' and it doesn't find anything
    • search on "\JacobiP{\alpha}{\beta}{n}@{x}" then it finds a lot
    • search on "\JacobiP{?y}{?z}{?x}@{?x}" it finds everything
    • search on "\JacobiP{?y}{?y}{?z}@{?x}" then it finds all matches with matching parameters alpha and beta
    • search on "\JacobiP{?y}{?w}{?z}" it gives me crazy output
    • search on "\JacobiP{?y}{?w}" it gives me nothing
    • search on \JacobiP gives me nothing
    • search on "JacobiP" gives me nothing
    • Search on "JacobiP" in the "Text pattern" field gives a fatal error and a long section of error statements.
    • Search on "\JacobiP{?y}{?w}{?z}@{?x}" the same results are produced with mws, db2, and basex.
    • Search for "\JacobiP{?y}{?w}{?z}@{?x} \Ultraspherical{?a}{?b}@{?c}" does not produce any results in mws, db2, and basex.
  2. XSEDE Quarry DRMF platform
  3. Wikimedia foundation DRMF test bed
  4. MathJax for menus
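The wildcard behaviour reported under item 1 (a repeated query variable such as ?y must bind to the same argument each time) can be illustrated with a minimal Python sketch. This is only a string-level approximation written for these notes: the function names are hypothetical, it assumes arguments without nested braces, and it is not how the mws, db2, or basex backends actually match formulae.

```python
import re


def qvar_pattern_to_regex(pattern):
    """Turn a query with ?name variables into a compiled regular expression.

    The first use of a variable becomes a named capture group; every later
    use becomes a backreference, so repeated variables must match equal text.
    Assumption: arguments contain no nested braces.
    """
    seen = set()
    parts = []
    pos = 0
    for m in re.finditer(r"\?([A-Za-z]\w*)", pattern):
        # Literal text between variables is matched verbatim.
        parts.append(re.escape(pattern[pos:m.start()]))
        name = m.group(1)
        if name in seen:
            parts.append("(?P=%s)" % name)
        else:
            seen.add(name)
            parts.append("(?P<%s>[^{}]+)" % name)
        pos = m.end()
    parts.append(re.escape(pattern[pos:]))
    return re.compile("".join(parts))


def matches(pattern, formula):
    """Return True if the whole formula string fits the query pattern."""
    return qvar_pattern_to_regex(pattern).fullmatch(formula) is not None
```

With this sketch, the query "\JacobiP{?y}{?y}{?z}@{?x}" accepts "\JacobiP{\alpha}{\alpha}{n}@{x}" but rejects "\JacobiP{\alpha}{\beta}{n}@{x}", while "\JacobiP{?y}{?w}{?z}@{?x}" accepts both.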

February 7, 2014 — Meet and Greet with Marlon Pierce, IU, XSEDE

  • System administrator: Mike Lowe
  • Marlon's team: Yu (Marie) Ma (systems person) yuma@iu.edu

February 10, 2014 — Discussion with Moritz Schubotz

  • Moritz wants to set up the debian package for mathoid
    • Mathoid converts TeX or MathML to SVG; it uses MathJax.
    • It's similar to Parsoid
    • MathML can only be viewed by Firefox users
    • Parsoid enables display for SVG as well

February 19, 2014 — Discussion with Moritz

If you need per-page or partial page access restrictions, you are advised to install an appropriate content management package. MediaWiki was not written to provide per-page access restrictions, and almost all hacks or patches promising to add them will likely have flaws somewhere, which could lead to exposure of confidential data. We are not responsible for anything being leaked, leading to loss of funds or one's job.

  • need to investigate this.

antica nesting search compound searches are not main requirement

March 27, 2014 — DRMF meeting

Attendees: Howard Cohl, Bonita Saunders, Marje McClain, Roberto Costas-Santos, Moritz Schubotz

Discussed Yu (Marie) Ma issues:

  • Puppet
  • GitHub / GitHub account name? / GitHub DRMF repository
  • MediaWiki extensions
  • Future DRMF development strategies
  • MathMenu JOBAD

Discussed student projects:

  1. Build Orthogonal Polynomial chapter
    • DLMF macros for
      • Koornwinder KLSadd.tex
      • Koekoek, Swarttouw & Lesky, Chapters 1, 9, 14
    • generate OP superstructure
    • generate OP Wikitext
    • Cherry Zou, Amber Liu
  2. DRMF community-arm formula data insertion MediaWiki extension
    • Brandon Alexander — SURF student
  3. DRMF MathJax formula menus MediaWiki extension
    • very specific description of the task
    • Jake Migdall — current student volunteer
    • Jimmy Li — SHIP student
  4. Semantic MediaWiki investigation
    • Teddy Corrales — current student volunteer
    • Etienne Rolly — TU-Berlin student working on db2 — finishing in April
      • MathWebSearch, db2/IBM (XML data storage), Semantic MediaWiki

Contact Ismail and Andrews at Lubbock Meeting — done.

Similarity search (Moritz Schubotz) — XQuery, no user interface

Digital Library 2014 book global (Olver) — Lozier

April 2, 2014 — Teleconference with XSEDE staff

Attendees: Howard Cohl, Moritz Schubotz, Yu (Marie) Ma, Mike Lowe

Discussed potential student development platform for students on XSEDE server

  • Mike suggested perhaps to use OpenShift — RedHat based project (free)

Is any of this workable on our local system?

  • Marie mentioned that Marlon Pierce is a Gateway expert
  • Moritz suggested optimal platform for Open Source Development
    1. Upload to GitHub or Gerrit
    2. Do code review
    3. Merge to GitHub Repository, MediaWiki extension
    4. Then we upload to public server

Moritz has recently described the standard solution for MediaWiki extension development

  • the necessary ingredients are: Git, VirtualBox, Vagrant

If these ingredients are not already installed on our system, they can be installed by following the instructions at https://www.mediawiki.org/wiki/MediaWiki-Vagrant

April 16, 2014 — Teleconference with Moritz

  • Aside: Moritz project in Iceland — A project related to data Market in Europe for transferring technology from Universities to Business

April 17, 2014 — DRMF meeting

Attendees: Howard Cohl, Bonita Saunders, Marje McClain

  • Discussed DRMF CICM 2014 paper
    • Need to insert relevant details corresponding to Reviewers notes
    • We will do this next week in a group meeting
    • Deadline Friday April 25th
  • Discussed student's progress
  • Need to contact high schools to investigate volunteers for next fall
    1. Poolesville High School
    2. Richard Montgomery High School
    3. Montgomery Blair High School
  • Discussed XSEDE progress and current state
    • Student development environment — will be active at WMF cluster
    • LaTeXML server — ongoing discussion
    • stable instance of Mathoid for Chrome (desired second step)

May 12, 2014 — Teleconference(s) with Moritz Schubotz

/vagrant/mediawiki/extensions/Math/modules/MathJax/unpacked/extensions

mathJax.config = $.extend( true, {
    root: mw.config.get( 'wgExtensionAssetsPath' ) + '/Math/modules/MathJax/unpacked',
    'v1.0-compatible': false,
    menuSettings: {
        zoom: 'Click'
    },
    'HTML-CSS': {
        imageFont: null,
        mtextFontInherit: true
    },
    MathMenu: {
        showLocale: false
    },
    jax: ['input/TeX', 'input/MathML', 'output/NativeMML', 'output/HTML-CSS']
}, mathJax.config );
  • PHP code
$wgResourceModules['ext.math.styles'] = array(
    'localBasePath' => __DIR__ . '/modules',
    'remoteExtPath' => 'Math/modules',
    'styles' => 'ext.math.css',
);

// MathJax module
// If you modify these arrays, update ext.math.mathjax.enabler.js to ensure
// that getModuleNameFromFile knows how to map files to MediaWiki modules.
$wgResourceModules += array(
    // This enables MathJax.
    'ext.math.mathjax.enabler' => array(
         'localBasePath' => __DIR__ . '/modules',
         'remoteExtPath' => 'Math/modules',
         'scripts' => 'ext.math.mathjax.enabler.js'
    ),
    // Main MathJax file
    'ext.math.mathjax.mathjax' => array(
         'localBasePath' => __DIR__ . '/modules/MathJax/unpacked',
         'remoteExtPath' => 'Math/modules/MathJax/unpacked',
         'scripts' => 'MathJax.js'
    ),

    // Localization data for the current language
    'ext.math.mathjax.localization' => array(
         'localBasePath' => __DIR__ . '/modules/MathJax/unpacked/localization',
         'remoteExtPath' => 'Math/modules/MathJax/unpacked/localization',
         'languageScripts' => array(
              // The localization data for 'en' are actually never used since an English fallback is always specified in MathJax's code when a string is used.
              'br' => array ( 'br/br.js', 'br/HelpDialog.js', 'br/MathMenu.js', 'br/TeX.js', 'br/FontWarnings.js', 'br/HTML-CSS.js', 'br/MathML.js' ),
              'cdo' => array ( 'cdo/cdo.js', 'cdo/HelpDialog.js', 'cdo/MathMenu.js', 'cdo/TeX.js', 'cdo/FontWarnings.js', 'cdo/HTML-CSS.js', 'cdo/MathML.js' ),
              'cs' => array ( 'cs/cs.js', 'cs/HelpDialog.js', 'cs/MathMenu.js', 'cs/TeX.js', 'cs/FontWarnings.js', 'cs/HTML-CSS.js', 'cs/MathML.js' ),
              'da' => array ( 'da/da.js', 'da/HelpDialog.js', 'da/MathMenu.js', 'da/TeX.js', 'da/FontWarnings.js', 'da/HTML-CSS.js', 'da/MathML.js' ),
              'de' => array ( 'de/de.js', 'de/HelpDialog.
  • Relevant PHP code — Frederick Wang wrote Math.php
<?php
// MediaWiki settings for Math.
// This file is managed by Puppet.
include_once "$IP/extensions/Math/Math.php";
/mnt/vagrant/settings.d/puppet-managed/10-Math.php
  • MathJax configuration script
<script type="text/x-mathjax-config">
MathJax.Hub.Config({
 extensions: ["http://cs.jsu.edu/mathjax-ext/github/modifymenu/modifymenu.js"]
});
</script>

June 23, 2014 — Teleconference with Moritz Schubotz

  • Use the MediaWiki API to push data from a Python web service into MediaWiki
    • Create a server component, build a server around it, and then communicate with MediaWiki through that server
    • Create a new wiki project called somesomething interface
    • Preferable to use JavaScript or PHP
  • Write data to dump, and then import it (Moritz used this for 1.5 million articles)
  • Student group at TU-Berlin — part of a regular group at TU-Berlin with 30 or 35 students
    • Group recommended using BaseX instead of db2
    • However db2 is much faster but is not open source
    • One can even get a copy of db2 for free for that purpose

June 24, 2014 — Teleconference with Moritz Schubotz

MathJax

  • Cervone, Krautzberger
  • Issues with the output LaTeX source
    • output generated by LaTeXML {\displaystyle{\displaystyle\sum_{k=2}^{\infty}\frac{\mathop{\zeta\/}\nolimits% \!\left(k\right)}{k}z^{k}=-\EulerConstant z+\mathop{\ln\/}\nolimits\mathop{% \Gamma\/}\nolimits\!\left(1-z\right)}}
    1. percent symbols — MathJax bug?
    2. extra displaystyle — this will resolve itself?
    3. \EulerConstant — why doesn't this expand? LaTeXML?
      • has something to do with the '@' sign
      • getting rid of the macros is done by LaTeXML
      • most likely this is a LaTeXML issue
  • switch on the debug mode, debug toolbar related to the server settings
    • $wgDebugToolbar = true;
    • $wgMathDebug=true;
    • $wgDebugMath=true;
    • At the bottom of the page, there will be output
      • if you want to regenerate the formula, then you need ?action=purge&mathpurge=true at the end of the URL.
  • file a bug at the LaTeXML GitHub repository.
    • tried to reproduce the whole problem using only the LaTeXML server, without building the MediaWiki library, by converting directly with LaTeXML
      • the only person who can fix this is Bruce
      • Set up a standalone LaTeXML instance which only runs LaTeXML.
      • in the local LaTeXML instance, input the macros and then try the conversion with latexmlc
      • http://dlmf.nist.gov/LaTeXML/examples.html
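The purge parameters noted above (?action=purge&mathpurge=true) can be appended to a page URL with a small helper. A minimal sketch, assuming a hypothetical helper name and using only the Python standard library; it keeps any query parameters the URL already carries.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def math_purge_url(page_url):
    """Return page_url with action=purge and mathpurge=true added to its query."""
    parts = urlsplit(page_url)
    # Preserve existing parameters such as title=... on index.php URLs.
    query = dict(parse_qsl(parts.query, keep_blank_values=True))
    query.update({"action": "purge", "mathpurge": "true"})
    return urlunsplit(parts._replace(query=urlencode(query)))
```

Opening the resulting URL in a browser should force the formulae on that page to be regenerated, after which the debug output appears at the bottom of the page as described above.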

NTCIR

  • upload plaintext
  • Two types of search
    1. math aspect
    2. text aspect

XSEDE

In the .gz file located at http://demo.formulasearchengine.com/images/en-wiki-formulae.tar.gz, there is a text file called wFormula-nocount.txt. It contains 287,201 lines of LaTeX strings. I think these strings originate from the English Wikipedia corpus.

The task is as follows. Write a bash shell script which runs the UNIX command

N=100; for (( i=0 ; i < $N ; i++ )) ; do echo $i; curl -d 'format=xhtml&whatsin=math&whatsout=math&pmml&cmml&nodefaultresources&preload=LaTeX.pool&preload=article.cls&preload=amsmath.sty&preload=amsthm.sty&preload=amstext.sty&preload=amssymb.sty&preload=eucal.sty&preload=%5Bdvipsnames%5Dxcolor.sty&preload=url.sty&preload=hyperref.sty&preload=%5Bids%5Dlatexml.sty&tex=literal:%5Csin+x%5E2' gw125.iu.xsede.org:8888 ; echo; done > output.txt ; echo "Failures: `grep '"result":null' output.txt | wc -l`/$N"

Run this command once for every line of that text file (all 287,201 of them). Each command will differ from the others in the following way.

In the last part of the request data in that UNIX command, 'tex=literal:%5Csin+x%5E2', you need to replace the %5Csin+x%5E2 with the string corresponding to each line of the text file wFormula-nocount.txt.

In summary, write a bash shell script (or perhaps a Python program, or any language you are happy with) which loops through the list of queries given in the wFormula-nocount.txt file, sending one request per line and submitting the special characters of the TeX command. This might work without even escaping the special symbols, but perhaps they need to be escaped. You would then need to search through the output to see whether the server crashed. If the output is OK, you get 100/100 OK; otherwise you get something like 50/100 failed.
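The task above could be approached along the following lines in Python. This is a sketch, not a tested client for the XSEDE server: the function names are hypothetical, the fixed parameters and the '"result":null' failure marker are copied from the curl command above, and treating any network error as a failure is an assumption.

```python
import urllib.parse
import urllib.request

SERVER = "http://gw125.iu.xsede.org:8888"  # host and port from the curl command

# Fixed request parameters, copied verbatim from the curl command.
FIXED_PARAMS = (
    "format=xhtml&whatsin=math&whatsout=math&pmml&cmml&nodefaultresources"
    "&preload=LaTeX.pool&preload=article.cls&preload=amsmath.sty"
    "&preload=amsthm.sty&preload=amstext.sty&preload=amssymb.sty"
    "&preload=eucal.sty&preload=%5Bdvipsnames%5Dxcolor.sty&preload=url.sty"
    "&preload=hyperref.sty&preload=%5Bids%5Dlatexml.sty"
)


def build_body(tex):
    """Return the POST body for one TeX string, with special characters escaped."""
    return FIXED_PARAMS + "&tex=literal:" + urllib.parse.quote_plus(tex)


def post(body):
    """Send one request to the server and return the reply as text."""
    request = urllib.request.Request(SERVER, data=body.encode("ascii"))
    with urllib.request.urlopen(request, timeout=60) as response:
        return response.read().decode("utf-8", errors="replace")


def count_failures(lines, send=post):
    """Send every non-empty line and return (failures, total)."""
    failures = total = 0
    for line in lines:
        tex = line.rstrip("\n")
        if not tex:
            continue
        total += 1
        try:
            reply = send(build_body(tex))
        except OSError:
            # Assumption: a refused connection or timeout counts as a failure.
            failures += 1
            continue
        if '"result":null' in reply:
            failures += 1
    return failures, total
```

To run it over the whole corpus, open wFormula-nocount.txt, pass the file object to count_failures, and print the two numbers as "Failures: k/N". The send parameter exists so that the loop can be tried against a stand-in function before pointing it at the real server.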

July 24, 2014 — DRMF meeting

Attendees: Moritz Schubotz, Marjorie McClain, Bonita Saunders, Jimmy Li, Alex Danoff

Moritz provided feedback on his discussions with Abdou Youssef, who is a George Washington University (GW) professor, immediate past chair of the GW Computer Science Department and primary architect of the DLMF search engine. Moritz said Abdou told him that a key problem is the need for better data to facilitate the design and development of a faster and more efficient search engine. Abdou suggested that Moritz try to tackle that problem as part of his dissertation work.

Moritz said that the DLMF search engine is tied to the Lucene search platform. If we try to use the current DLMF search engine for the DRMF, we would have to make a lot of updates, but we would not learn anything new. He would prefer that we use XQuery, a database query and programming language, which is not based on Lucene and is more flexible.

Moritz also pointed out that if we wanted to stick with the Lucene platform there is another Lucene-based search engine, EuDML (see https://eudml.org/ ), that is similar to the DLMF search, but is open source and current; the latest version came out in 2014. While it could be easily integrated into the DRMF in 2-7 days, that would be a mere programming exercise. He said we could learn more by developing our own search engine.

The other part is getting the data in the right format. While Moritz applauded the work done so far, he said we need to improve the efficiency of the seeding project. The use of LaTeXML (HSC: Why?) should be key so that we don’t have to maintain so many programs. And the identification of constraints can be done at a later phase using MediaWiki preview and bots. While it may take some time to develop this, the long term benefits of having a reliable framework will be large. Alex noted that having such a framework sounds a lot easier than the current method which involves making many responses at the command line. Bonita noted that since the seeding project can involve a lot of changing personnel over time, this (HSC: ?) framework may make it easier for new people to transition into the project.

Jimmy and Alex updated us on the work they had been doing on search and seeding, respectively. Alex’s last day is next Friday, August 1. Since he will be winding down his work next week, he asked if the DRMF Meeting could be held earlier in the week. Pending Howard Cohl’s approval, the next meeting is tentatively scheduled for Tuesday, July 29. Moritz is planning to discuss his proposed dissertation work. Abdou has tentatively said he will attend if he is at NIST on that day. Correction: the next DRMF meeting is scheduled for Thursday, July 31st.

October 16, 2014 — DRMF meeting

Attendees: Howard Cohl, Moritz Schubotz, Marje McClain, Bonita Saunders, Cherry Zou, Shraeya Madhu, Azeem Mohammed

Topics of discussion:

  • Howard Cohl — OPSFA13 Minisymposium on Digital Libraries
  • Moritz Schubotz
    • Mathoid at Wikipedia (forwarded to LaTeXML mailing list)
    • GitHub two-factor authentication — he says to just try it (read GitHub)
    • Mathematica to DLMF macro conversion for formulas (TU-Berlin student project)
  • Marje McClain
    • Annotation/Metadata for KLS data
      • Highlighted Koornwinder Addendum metadata (will work on Proof metadata)
      • How do we deal with citations in Koornwinder Addendum
    • Trying to convince her to speak at XSEDE meeting July 26-30, 2015 (https://www.xsede.org/web/conference/xsede15)
  • Bonita Saunders
    • Poolesville High School visit
      • Scheduled for Wednesday November 12th
      • Mark Curran and Teresa Mallow
    • Create Numerical OPSFA13 Minisymposium (Live Tables)
  • Cherry Zou — Forwarded her Siemens document
  • Shraeya Madhu — Working on DLMF Chapters 5 (gamma) and 15 (hypergeometric)
  • Azeem Mohammed — Working on Wikitext generation using LaTeXML and XSLT style sheets
    • Splitting working with metadata
    • Have to work on symbols list
    • Need to get Glossary data from Moritz
    • XSEDE LaTeXML server working ok — Fred Wang
  • Bruce Miller says DLMF website is superset of DLMF book
    • Recommended \def\foo#1{\footnote{FOO: #1}} for Azeem's issue
  • Comma Separated Value (CSV) format specification